PREFACE


In the usual undergraduate mathematical curriculum, the courses 
which a student takes during his first two years are those leading up 
to the calculus and the calculus itself. A few years ago, the department
of mathematics at Dartmouth College decided to introduce a
different kind of freshman course which students could elect along 
with these more traditional ones. The new course was to be designed 
to introduce a student to some concepts in modern mathematics early 
in his college career. While primarily a mathematics course, it was 
to include applications to the biological and social sciences and thus 
provide a point of view, other than that given by physics, concerning
the possible uses of mathematics.

In planning the proposed course, we found that there was no 
textbook available to fulfill our needs and, therefore, we decided to 
write such a book. Our aim was to choose topics which are initially 
close to the student's experience, which are important in modern-day
mathematics, and which have interesting and important applications.
To guide us in the latter we asked for the opinions of a number of
behavioral scientists about the kinds of mathematics a future
behavioral scientist might need. The main topics of the book were
chosen from this list. 

Our purpose in writing the book was to develop several topics 
from a central point of view. In order to accomplish this on an 
elementary level, we restricted ourselves to the consideration of 
finite problems, that is, problems which do not involve infinite sets, 
limiting processes, continuity, etc. By so doing it was possible to go 
further into the subject matter than would otherwise be possible, and 
we found that the basic ideas of finite mathematics were easier to 
state and theorems about them considerably easier to prove than 
their infinite counterparts. 





The first five chapters form a natural unit. The discussion of the
set of logical possibilities (in Chapter I) leads to the idea of the truth
set associated with a statement which, in turn, gives a natural way
of defining the probability that a statement is true (in Chapter IV).
The correspondence that exists among logical operations (Chapter I),
set operations (Chapter II), and probability operations (Chapter IV)
becomes especially transparent in the finite case. A very useful
pedagogical device, that of a “tree” (a special type of diagram), is used in
these chapters and in the rest of the book to illustrate and clarify
ideas. In particular, this allows an introduction to the theory of
stochastic processes in an elementary manner. The Markov chains
here introduced help to motivate vector and matrix theory, which is
presented in Chapter V.

In Chapter VI the student is introduced to two recent branches of 
mathematics that have proved useful in applications, namely, linear 
programming and the theory of games. We are able to explain the 
basic ideas of both relatively quickly because of the mathematical 
preparation given in the earlier chapters. 

In our concluding Chapter VII we discuss several significant
applications of mathematics to the behavioral sciences. These were
selected for their interest both to mathematicians and to behavioral
scientists. One topic was chosen from each of five sciences: sociology,
genetics, psychology, anthropology, and economics. A reader may
find it more difficult to read parts of this chapter than the earlier
chapters, but it was found necessary to make it so in order that
nontrivial applications could be taken up and pursued far enough to see
the contributions mathematics makes. In teaching a course from our
book we would not expect that all of the topics from this chapter
would be used. We hope, however, that Chapter VII will serve as
reference and self-study material for ambitious students.

The Committee on the Undergraduate Program of the Mathematical
Association of America was planning a new freshman mathematics
program at the same time we were planning our book. They had
already written Part I of Universal Mathematics, which is an
introduction to analytic geometry and the calculus, and were making plans
for Part II. When the chairman of that committee learned of the
similarity of the plans for our book to those for Part II of Universal
Mathematics, he invited one of us to join his committee. We believe
that our book agrees with the spirit of their recommendations. We
are grateful for their permission to use some of their illustrations, of
which the applications to voting problems are the principal ones.

The report of the Committee on Mathematical Training of Social
Scientists of the Social Science Research Council appeared after our
plans were completed. We were pleased to note that on many
questions we had reached the same conclusions as had that committee.
They recommend two years of training, about half in the calculus
and half along the lines here discussed. A semester course based on
our book together with a semester of calculus would give the student
a distribution in the proportions recommended by that committee.

The basic core of the book consists of the unasterisked sections of
Chapters I-V. This material should be covered in every course.
Flexibility is provided by the inclusion of additional material: the optional
(asterisked) sections of these chapters, Chapter VI, and Chapter VII.
By emphasizing the first five chapters, the course would be a basic
mathematics course. By aiming at Chapter VII and taking up several
of these applications, the course can be designed as a mathematics
course suited for the behavioral scientist. Chapter VI is appropriate
as supplementary material for either type of course. We have
included a bibliography at the end of each chapter to guide those
interested in further reading.

The only prerequisite for this book is the mathematical maturity
obtained from two and a half or more years of high school
mathematics. Our book has been tried successfully in a freshman course at
Dartmouth College and for supplementary reading in other courses.
It has also been used in a mathematics course for faculty members
in the behavioral sciences.

We wish to thank Dartmouth College for releasing us from part
of our teaching duties to enable us to prepare this book. Thanks are
also due to A. W. Tucker for his valuable advice and to our colleagues
in the mathematics department at Dartmouth for their many helpful
suggestions. We are also grateful to James K. Schiller for reading the
manuscript and for providing the reactions of a student. Finally we
wish to thank Joan Snell, Margaret P. Andrews, and Stephen Russell
for their invaluable aid in the preparation of the manuscript.

J. G. K.

J. L. S.

G. L. T.

COMPOUND STATEMENTS


1. PURPOSE OF THE THEORY 

A statement is a verbal or written assertion. In the English language 
such assertions are made by means of declarative sentences. For
example, “It is snowing” and “I made a mistake in signing up for this
course” are statements. 

The two statements quoted above are simple statements. A
combination of two or more simple statements is a compound statement.
For example, “It is snowing, and I wish that I were out of doors, but 
I made the mistake of signing up for this course,” is a compound 
statement. 

It might seem natural that one should make a study of simple 
statements first, and then proceed to the study of compound ones. 
However, the reverse order has proved to be more useful. Because of 
the tremendous variety of simple statements, the theory of such
statements is very complex. It has been found in mathematics that it is
often fruitful to assume for the moment that a difficult problem has
been solved and then to go on to the next problem. Therefore we shall
proceed as if we knew all about simple statements and study only
the way they are compounded. The latter is a relatively easy
problem.

While the first systematic treatment of such problems is found in 
the writings of Aristotle, mathematical methods were first employed 
by George Boole about 100 years ago. The more polished techniques
now available are the product of twentieth-century mathematical
logicians.

The fundamental property of any statement is that it is either true 
or false (and that it cannot be both true and false). Naturally, we are 
interested in finding out which is the case. For a compound statement 
it is sufficient to know which of its components are true since the 
truth values (i.e., the truth or falsity) of the components determine 
in a way to be described later the truth value of the compound. 

Our problem then is twofold: (1) In how many different ways can 
statements be compounded? (2) How do we determine the truth value 
of a compound statement given the truth values of its components? 

Let us prepare our mathematical tools. In any mathematical formula
we find three kinds of symbols: constants, variables, and auxiliary
symbols. For example, in the formula (x + y)^2 the plus sign and the
exponent are constants, the letters x and y are variables, and the
parentheses are auxiliary symbols. Constants are symbols whose
meanings in a given context are fixed. Thus in the formula given
above, the plus sign indicates that we are to form the sum of the two
numbers x and y, while the exponent 2 indicates that we are to multiply
(x + y) by itself. Variables always stand for entities of a given
kind, but they allow us to leave open just which particular entity we
have in mind. In our example above the letters x and y stand for
unspecified numbers. Auxiliary symbols function somewhat like
punctuation marks. Thus if we omit the parentheses in the expression
above we obtain the formula x + y^2, which has quite a different
meaning than the formula (x + y)^2.

In this chapter we shall use variables of only one kind. We indicate 
these variables by the letters p, q, r, etc., which will stand for
unspecified statements. These statements frequently will be simple
statements but may also be compound. In any case we know that, 
since each variable stands for a statement, it has an (unknown) truth 
value. 

The constants that we shall use will stand for certain connectives 
used in the compounding of statements. We will have one symbol for 
forming the negation of a statement and several symbols for combining
two statements. It will not be necessary to introduce symbols for
the compounding of three or more statements, since we can show that 
the same combination can also be formed by compounding them two 
at a time. In practice only a small number of basic constants are used
and the others are defined in terms of these. It is even possible to use
only a single connective! (See Section 4, Exercises 10 and 11.)

The auxiliary symbols that we shall use are, for the most part, the 
same ones used in elementary algebra. Any case where the usage is 
different will be explained. 

Examples. As examples of simple statements let us take “The 
weather is nice” and “It is very hot.” We will let p stand for the 
former and q for the latter. 

Suppose we wish to make the compound statement that both are
true: “The weather is nice and it is very hot.” We shall symbolize this
statement by p ∧ q. The symbol ∧, which can be read “and,” is our
first connective.

In place of the strong assertion above we might want to make the 
weak (cautious) assertion that one or the other of the statements is 
true: “The weather is nice or it is very hot.” We symbolize this
assertion by p ∨ q. The symbol ∨, which can be read “or,” is the
second connective which we shall use.

Suppose we believed that one of the statements above was false,
for example, “It is not very hot.” Symbolically we would write ~q.
Our third connective is then ~, which can be read “not.”

More complex compound statements can now be made. For example,
p ∧ ~q stands for “The weather is nice and it is not very hot.”


EXERCISES 

1. The following are compound statements or may be so interpreted. 
Find their simple components. 

(a) It is hot and it is raining. 

(b) It is hot but it is not very humid. 

[Ans. “It is hot”; “it is very humid.”] 

(c) It is raining or it is very humid. 

(d) Jack and Jill went up the hill. 

(e) The murderer is Jones or Smith. 

(f) It is neither necessary nor desirable. 

(g) Either Jones wrote this book or Smith did not know who the
author was.

2. In Exercise 1 assign letters to the various components, and write the
statements in symbolic form. [Ans. (b) p ∧ ~q.]

3. Write the following statements in symbolic form, letting p be “Fred
is smart” and q be “George is smart.”

(a) Fred is smart and George is stupid. 

(b) George is smart and Fred is stupid. 

(c) Fred and George are both stupid. 

(d) Either Fred is smart or George is stupid. 

(e) Neither Fred nor George is smart. 

(f) Fred is not smart, but George is stupid. 

(g) It is not true that Fred and George are both stupid. 

4. Assume that Fred and George are both smart. Which of the seven 
compound statements in Exercise 3 are true? 

5. Write the following statements in symbolic form. 

(a) Fred likes George. (Statement p.) 

(b) George likes Fred. (Statement q.) 

(c) Fred and George like each other. 

(d) Fred and George dislike each other. 

(e) Fred likes George, but George does not reciprocate. 

(f) George is liked by Fred, but Fred is disliked by George. 

(g) Neither Fred nor George dislikes the other. 

(h) It is not true that Fred and George dislike each other. 

6. Suppose that Fred likes George and George dislikes Fred. Which of 
the eight statements in Exercise 5 are true? 

* 7. For each statement in Exercise 5 give a condition under which it is
false. [Ans. (c) Fred does not like George.]

8. Let p be “Stock prices are high,” and q be “Stocks are rising.” Give 
a verbal translation for each of the following. 

(a) p ∧ q.

(b) p ∧ ~q.

(c) ~p ∧ ~q.

(d) p ∨ ~q.

(e) ~(p ∧ q).

(f) ~(p ∨ q).

(g) ~(~p ∨ ~q).

9. Using your answers to Exercise 8, parts (e), (f), (g), find simpler 
symbolic statements expressing the same idea. 

10. Let p be “I have a dog,” and q be “I have a cat.” Translate into
English and simplify: ~[~p ∨ ~~q] ∧ ~~p.

2. THE MOST COMMON CONNECTIVES

The truth value of a compound statement is determined by the 
truth values of its components. When discussing a connective we will 
want to know just how the truth of a compound statement made 
from this connective depends upon the truth of its components. A 
very convenient way of tabulating this dependency is by means of a 
truth table. 

Let us consider the compound p ∧ q. Statement p could be either
true or false and so could statement q. Thus there are four possible
pairs of truth values for these statements and we want to know in each
case whether or not the statement p ∧ q is true. The answer is
straightforward: If p and q are both true, then p ∧ q is true, and
otherwise p ∧ q is false. This seems reasonable since the assertion
p ∧ q says no more and no less than that p and q are both true.

Figure 1 gives the truth table for p ∧ q, the conjunction of p and q.
The truth table contains all the information that we need to know
about the connective ∧, namely it tells us the truth value of the
conjunction of two statements given the truth values of each of the
statements.

We next look at the compound statement p ∨ q, the disjunction of p
and q. Here the assertion is that one or the other of these statements
is true. Clearly, if one statement is true and the other false,
then the disjunction is true, while if both statements are false, then
the disjunction is certainly false. Thus we can fill in the last three
rows of the truth table for disjunction (see Figure 2).

Observe that one possibility is left unsettled, namely, what happens
if both components are true? Here we observe that the everyday
usage of “or” is ambiguous. Does “or” mean “one or the other or
both” or does it mean “one or the other but not both”?

    p   q  |  p ∧ q
   --------+--------
    T   T  |    T
    T   F  |    F
    F   T  |    F
    F   F  |    F

   Figure 1


    p   q  |  p ∨ q
   --------+--------
    T   T  |    ?
    T   F  |    T
    F   T  |    T
    F   F  |    F

   Figure 2


Let us seek the answer in examples. The sentence “This summer I
will date Jean or Pat,” allows for the possibility that the speaker may
date both girls. However the sentence “I will go to Dartmouth or to 
Princeton,” indicates that only one of these schools will be chosen. 
“I will buy a TV set or a phonograph next year,” could be used in 
either sense; the speaker may mean that he is trying to make up his 
mind which one of the two to buy, but it could also mean that he will 
buy at least one of these—possibly both. We see that sometimes the 
context makes the meaning clear but not always. 

A mathematician would never waste his time on a dispute as to
which usage “should” be called the disjunction of two statements.
Rather he recognizes two perfectly good usages, and calls one the
inclusive disjunction (p or q or both) and the other the exclusive
disjunction (p or q but not both). The symbol ∨ will be used for inclusive
disjunction, and the symbol ⊻ will be used for exclusive disjunction.
The truth tables for each of these are found in Figures 3 and 4 below.


V 

<1 

pV q 

T 

T 

T 

T 

F 

T 

F 

T 

T 

F 

F 

F 


Figure 3 


V 

<1 

p \Lq 

T 

T 

F 

T 

F 

T 

F 

T 

T 

F 

F 

F 


Figure 4 


Unless we state otherwise, our disjunctions will be inclusive
disjunctions.
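
The two usages can also be compared mechanically. The following Python sketch is an illustration added here, not part of the original text; the function names inclusive_or, exclusive_or, and column are assumptions chosen for this example. It lists the truth values of each disjunction for the four pairs of truth values, in the row order TT, TF, FT, FF used in Figures 3 and 4.

```python
from itertools import product


def inclusive_or(p: bool, q: bool) -> bool:
    """p or q or both."""
    return p or q


def exclusive_or(p: bool, q: bool) -> bool:
    """p or q but not both."""
    return p != q


def column(connective) -> str:
    """Truth values of the connective for the rows TT, TF, FT, FF."""
    rows = product([True, False], repeat=2)
    return "".join("T" if connective(p, q) else "F" for p, q in rows)


print(column(inclusive_or))  # TTTF
print(column(exclusive_or))  # FTTF
```

The two columns differ only in the first row, which is exactly the case left unsettled in Figure 2.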

The last connective which we shall discuss in this section is
negation. If p is a statement, the symbol ~p, called the negation of p,
asserts that p is false. Hence ~p is true when p is false, and false
when p is true. The truth table for negation is shown in Figure 5.


    p  |  ~p
   ----+-----
    T  |  F
    F  |  T

   Figure 5


Besides using these basic connectives singly to form compound
statements, several can be used to form a more complicated compound
statement, in much the same way that complicated algebraic expressions
can be formed by means of the basic arithmetic operations. For example,
~(p ∧ q), p ∧ ~p, and (p ∨ q) ∨ ~p are all compound statements. They
are to be read “from the inside out” in the same way that algebraic
expressions are, namely, quantities inside the innermost parentheses
are first grouped together, then these parentheses are grouped
together, etc. Each compound statement has a truth table which can
be constructed in a routine way. The following examples show how
to construct truth tables.

Example 1. Consider the compound statement p ∨ ~q. We begin
the construction of its truth table by writing in the first two columns
the four possible pairs of truth values for the statements p and q.
Then we write the proposition in question, leaving plenty of space
between symbols so that we can fill in columns below. Next we copy
the truth values of p and q in the columns below their occurrences in
the proposition. This completes step 1 of the construction (see
Figure 6).

Next we treat the innermost compound, the negation of the variable
q, completing step 2 (see Figure 7).


V 

q 

p V ~q 

T 

T 

T T 

T 

F 

T F 

F 

T 

F T 

F 

F 

F F 

Step No. 

1 1 


Figure 6 


P 

q 

V 

v ~ 

q 

T 

T 

T 

F 

T 

T 

F 

T 

T 

F 

F 

T 

F 

F 

T 

F 

F 

F 

T 

F 

Step No. 

1 

2 

1 


Figure 7 


Finally we fill in the column under the disjunction symbol, which 
gives us the truth value of the compound statement for various truth 
values of its variables. To indicate this we place two parallel lines 
on each side of the final column, completing step 3 as in Figure 8. 

    p   q   |  p  || ∨ ||  ~   q
   ---------+---------------------
    T   T   |  T  || T ||  F   T
    T   F   |  T  || T ||  T   F
    F   T   |  F  || F ||  F   T
    F   F   |  F  || T ||  T   F
   ---------+---------------------
   Step No. |  1  || 3 ||  2   1

   Figure 8


The next two examples show truth tables of more complicated
compounds worked out in the same manner. There are only two basic
rules which the student must remember when working these: first,
work from the “inside out”; and second, the truth values of the
compound statement are found in the last column filled in during this
procedure.
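
The column-by-column procedure of Example 1 can be imitated in a few lines of code. The Python sketch below is an added illustration, not the book's own method; the names fmt and table_p_or_not_q are assumptions made for this example. It computes the innermost compound first and the disjunction last, so the last value computed in each row is the truth value of the compound.

```python
from itertools import product


def fmt(value: bool) -> str:
    """Render a truth value as the letter T or F."""
    return "T" if value else "F"


def table_p_or_not_q():
    """Build the truth table of p v ~q column by column, inside out."""
    rows = []
    for p, q in product([True, False], repeat=2):  # step 1: rows TT, TF, FT, FF
        not_q = not q                              # step 2: the innermost compound, ~q
        result = p or not_q                        # step 3: the disjunction, filled in last
        rows.append((p, q, not_q, result))
    return rows


for p, q, not_q, result in table_p_or_not_q():
    print(fmt(p), fmt(q), fmt(not_q), fmt(result))

final_column = "".join(fmt(row[-1]) for row in table_p_or_not_q())
print(final_column)  # TTFT
```

The printed final column agrees with the column under the disjunction symbol in Figure 8.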


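Example 2. The truth table for the statement (p ∨ ~q) ∧ ~p
together with the numbers indicating the order in which the columns
are filled in appears in Figure 9.


    p   q   |  (p   ∨   ~   q)  || ∧ ||  ~   p
   ---------+-----------------------------------
    T   T   |   T   T   F   T   || F ||  F   T
    T   F   |   T   T   T   F   || F ||  F   T
    F   T   |   F   F   F   T   || F ||  T   F
    F   F   |   F   T   T   F   || T ||  T   F
   ---------+-----------------------------------
   Step No. |   1   3   2   1   || 4 ||  2   1

   Figure 9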


Example 3. The truth table for the statement ~[(p ∧ q) ∨
(~p ∧ ~q)] together with the numbers indicating the order in which
the columns are filled appears in Figure 10.


    p   q   || ~ ||  [(p   ∧   q)   ∨   (~   p   ∧   ~   q)]
   ---------++---++-------------------------------------------
    T   T   || F ||    T   T   T    T    F   T   F   F   T
    T   F   || T ||    T   F   F    F    F   T   F   T   F
    F   T   || T ||    F   F   T    F    T   F   F   F   T
    F   F   || F ||    F   F   F    T    T   F   T   T   F
   ---------++---++-------------------------------------------
   Step No. || 5 ||    1   3   1    4    2   1   3   2   1

   Figure 10


EXERCISES 


1. Give a compound statement which symbolically states “p or q but
not both,” using only ~, ∨, and ∧.

2. Construct the truth table for your answer to Exercise 1, and compare 
this with Figure 4. 


3. Construct the truth table for the symbolic form of each statement in 
Exercise 3 of Section 1. How does Exercise 4 of Section 1 relate to these 
truth tables? 


4. Construct a truth table for each of the following:

(a) ~(p ∧ q). [Ans. FTTT.]

(b) p ∧ ~p. [Ans. FF.]

(c) (p ∨ q) ∨ ~p. [Ans. TTTT.]

(d) ~[(p ∨ q) ∧ (~p ∨ ~q)]. [Ans. TFFT.]


5. Let p stand for “Jones passed the course” and q stand for “Smith 
passed the course” and translate into symbolic form the statement “It is 
not the case that Jones and Smith both failed the course.” Construct a 
truth table for this compound statement. State in words the circumstances 
under which the statement is true. 


6. Construct a simpler statement about Jones and Smith that has the 
same truth table as the one in Exercise 5. 


7. Let p | q express that “p and q are not both true.” Write a symbolic
expression for p | q using ~ and ∧.

8. Write a truth table for p | q. 

9. Write a truth table for p | p. [Ans. Same as Figure 5.] 

10. Write a truth table for (p | q) | (p | q). [Ans. Same as Figure 1.] 


11. Construct a truth table for each of the following:

(a) ~(p ∨ q) ∨ ~(q ∨ p). [Ans. FFFT.]

(b) ~(p ∨ q) ∧ p. [Ans. FFFF.]

(c) ~(p ⊻ q). [Ans. TFFT.]

(d) ~(p | q). [Ans. TFFF.]


12. Construct two symbolic statements, using only ~, ∨, and ∧, which
have the truth tables (a) and (b), respectively:


    p   q  |  (a)   (b)
   --------+------------
    T   T  |   T     T
    T   F  |   F     F
    F   T  |   T     F
    F   F  |   T     T


3. OTHER CONNECTIVES 

Suppose we did not wish to make an outright assertion but rather
an assertion containing a condition. As examples consider the following
sentences. “If the weather is nice, I will take a walk.” “If the
following statement is true, then I can prove the theorem.” “If the
cost of living continues to rise, then the government will impose rigid
curbs.” Each of these statements is of the form “if p then q.” The
conditional is then a new connective which is symbolized by the
arrow →.

Of course the precise definition of this new connective must be made
by means of a truth table. If both p and q are true, then p → q is
certainly true, and if p is true and q false, then p → q is certainly
false. Thus the first two lines of the truth table can easily be filled in
(see Figure 11a). Suppose now that p is false; how shall we fill in the
last two lines of the truth table in Figure 11a? At first thought one
might suppose that it would be best to leave it completely undefined.
However, to do so would violate our basic principle that a statement
is either true or false.


    p   q  |  p → q
   --------+--------
    T   T  |    T
    T   F  |    F
    F   T  |    ?
    F   F  |    ?

   Figure 11a


    p   q  |  p → q
   --------+--------
    T   T  |    T
    T   F  |    F
    F   T  |    T
    F   F  |    T

   Figure 11b


Therefore we make the completely arbitrary decision that the
conditional, p → q, is true whenever p is false, regardless of the truth value
of q. This decision enables us to complete the truth table for the
conditional and it is given in Figure 11b. A glance at this truth table
shows that the conditional p → q is considered false only if p is true
and q is false. If we wished we might rationalize the arbitrary decision
made above by saying that if statement p happens to be false then we
give the conditional p → q the “benefit of the doubt” and consider
it true (see Exercise 1).
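
The rule just stated, that the conditional is false only if p is true and q is false, translates directly into code. The following Python sketch is an added illustration; the function name conditional is an assumption of this example, not notation from the text.

```python
from itertools import product


def conditional(p: bool, q: bool) -> bool:
    """p -> q: false only when p is true and q is false."""
    return not (p and not q)


# Rows in the order TT, TF, FT, FF, as in Figure 11b.
rows = list(product([True, False], repeat=2))
conditional_column = "".join("T" if conditional(p, q) else "F" for p, q in rows)
print(conditional_column)  # TFTT
```

The last two letters of the output are the two entries that were question marks in Figure 11a.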

In everyday conversation it is customary to combine simple statements
only if they are somehow related. Thus we might say “It is
raining today and I will take an umbrella,” but we would not say
“I read a good book and I will take an umbrella.” However, the rather
ill-defined concept of relatedness is difficult to enforce. Concepts
related to each other in one person’s mind need not be related in
another’s. In our study of compound statements no requirement of
relatedness is imposed on two statements in order that they be
compounded by any of the connectives. This freedom sometimes produces
strange results in the use of the conditional. For example, according
to the truth table in Figure 11b the statement “If 2 × 2 = 5, then
black is white” is true, while the statement “If 2 × 2 = 4, then cows
are monkeys” is false. Since we use the “if . . . then . . .” form
usually only when there is a causal connection between the two
statements, we might be tempted to label both of the above statements
as nonsense. At this point it is important to remember that no such
causal connection is intended in the usage of →; the meaning of the
conditional is contained in Figure 11b and nothing more is intended.
This point will be discussed again in Section 7 in connection with
implication.

Closely connected to the conditional connective is the biconditional
statement, p ↔ q, which may be read “p if and only if q.” The
biconditional statement asserts that if p is true, then q is true, and if p
is false, then q is false. Hence the biconditional is true in these cases
and false in the others so that its truth table can be filled in as in
Figure 12.


    p   q  |  p ↔ q
   --------+--------
    T   T  |    T
    T   F  |    F
    F   T  |    F
    F   F  |    T

   Figure 12

The biconditional is the last of the five connectives which we shall 
use in this chapter. The table below gives a summary of them
together with the numbers of the figures giving their truth tables.


    Name                      Symbol   Translated as                   Truth Table
    ------------------------  ------   -----------------------------   -----------
    Conjunction                 ∧      “and”                           Figure 1
    Disjunction (inclusive)     ∨      “or”                            Figure 3
    Negation                    ~      “not”                           Figure 5
    Conditional                 →      “if . . . then . . .”           Figure 11b
    Biconditional               ↔      “. . . if and only if . . .”    Figure 12


Remember that the complete definition of each of these connectives 
is given by its truth table. The examples below show the use of the 
two new connectives. 

Examples. In Figures 13 and 14 the truth tables of two statements 
are worked out following the procedure of Section 2. 


    p   q   |  p  || → ||  (p   ∨   q)
   ---------+---------------------------
    T   T   |  T  || T ||   T   T   T
    T   F   |  T  || T ||   T   T   F
    F   T   |  F  || T ||   F   T   T
    F   F   |  F  || T ||   F   F   F
   ---------+---------------------------
   Step No. |  1  || 3 ||   1   2   1

   Figure 13


    p   q   |  ~   p  || ↔ ||  (p   →   ~   q)
   ---------+-----------------------------------
    T   T   |  F   T  || T ||   T   F   F   T
    T   F   |  F   T  || F ||   T   T   T   F
    F   T   |  T   F  || T ||   F   T   F   T
    F   F   |  T   F  || T ||   F   T   T   F
   ---------+-----------------------------------
   Step No. |  2   1  || 4 ||   1   3   2   1

   Figure 14


It is also possible to form compound statements from three or more
simple statements. The next example is a compound formed from
three simple statements p, q, and r. Notice that there will be a total
of eight possible triples of truth values for these three statements so
that the truth table for our compound will have eight rows as shown
in Figure 15.


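    p   q   r   |  [p   →   (q   ∨   r)]  || ∧ ||  ~   [p   ↔   ~   r]
   -------------+-------------------------------------------------------
    T   T   T   |   T   T    T   T   T    || T ||  T    T   F   F   T
    T   T   F   |   T   T    T   T   F    || F ||  F    T   T   T   F
    T   F   T   |   T   T    F   T   T    || T ||  T    T   F   F   T
    T   F   F   |   T   F    F   F   F    || F ||  F    T   T   T   F
    F   T   T   |   F   T    T   T   T    || F ||  F    F   T   F   T
    F   T   F   |   F   T    T   T   F    || T ||  T    F   F   T   F
    F   F   T   |   F   T    F   T   T    || F ||  F    F   T   F   T
    F   F   F   |   F   T    F   F   F    || T ||  T    F   F   T   F
   -------------+-------------------------------------------------------
   Step No.     |   1   3    1   2   1    || 5 ||  4    1   3   2   1

   Figure 15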
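

An eight-row table of this kind is tedious to fill in by hand but easy to check by machine. The Python sketch below is an added illustration, not part of the original text. It assumes the compound in Figure 15 is [p → (q ∨ r)] ∧ ~[p ↔ ~r], and the helper names conditional, biconditional, and statement are chosen for this example only.

```python
from itertools import product


def conditional(p: bool, q: bool) -> bool:
    """p -> q: false only when p is true and q is false."""
    return not (p and not q)


def biconditional(p: bool, q: bool) -> bool:
    """p <-> q: true when p and q have the same truth value."""
    return p == q


def statement(p: bool, q: bool, r: bool) -> bool:
    """[p -> (q v r)] ^ ~[p <-> ~r], evaluated from the inside out."""
    left = conditional(p, q or r)
    right = not biconditional(p, not r)
    return left and right


# The eight triples in the order TTT, TTF, TFT, TFF, FTT, FTF, FFT, FFF.
triples = list(product([True, False], repeat=3))
print(len(triples))  # 8

final_column = "".join("T" if statement(p, q, r) else "F" for p, q, r in triples)
print(final_column)  # TFTFFTFT
```

The output string is the column filled in at step 5 of the table.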


EXERCISES 

1. One way of filling in the question-marked squares in Figure 11a is
given in Figure 11b. There are three other possible ways.

(a) Write the three other truth tables.

(b) Show that each one of these truth tables has an interpretation in
terms of the connectives now available to us.

2. Write truth tables for q ∨ p, q ∧ p, q → p, q ↔ p. Compare these
with the truth tables in Figures 3, 1, 11b, and 12, respectively.

3. Construct truth tables for:

(a) p → (q ∨ r). [Ans. TTTFTTTT.]

(b) (p ∨ r) ∧ (p → q). [Ans. TTFFTFTF.]

(c) (p ∨ q) ↔ (q ∨ p). [Ans. TTTT.]

(d) p ∧ ~p. [Ans. FF.]

(e) (p → p) ∨ (p → ~p). [Ans. TT.]

(f) (p ∨ ~q) ∧ r. [Ans. TFTFFFTF.]

(g) [p → (q → r)] → [(p → q) → (p → r)]. [Ans. TTTTTTTT.]


4. For each of the following statements (i) find a symbolic form, and 
(ii) construct the truth table. Use the notation: p for “Joe is smart,” q for 
“Jim is stupid,” r for “Joe will get the prize.” 

(a) If Joe is smart and Jim is stupid, then Joe will get the prize. 

[Ans. TFTTTTTT.] 

(b) Joe will get the prize if and only if either he is smart or Jim is 

stupid. [Ans. TFTFTFFT.] 

(c) If Jim is stupid but Joe fails to get the prize, then Joe is not smart. 

[Ans. Same as (a).] 


5. Construct truth tables for each of the following, and give an
interpretation.

(a) (p → q) ∧ (q → p). (Compare with Figure 12.)

(b) (p ∧ q) → p.

(c) q → (p ∨ q).

(d) (p → q) ↔ (~p ∨ q).

6. The truth table for a statement compounded from two simple
statements has four rows, and the truth table for a statement compounded from
three simple statements has eight rows. How many rows would the truth 
table for a statement compounded from four simple statements have? How 
many for five? For n? Devise a systematic way of writing down these latter 
truth tables. 


7. Let p be “It is raining,” and q be “The wind is blowing.” Translate 
each of the following into symbolic form. 

(a) If it rains, then the wind blows. 

(b) If the wind blows, then it rains. 

(c) The wind blows if and only if it rains. 

(d) If the wind blows, then it does not rain. 

(e) It is not the case that the wind blows if and only if it does not rain. 

8. Construct truth tables for the statements in Exercise 7. 

[Ans. TFTT, TTFT, TFFT, FTTT, TFFT.] 

9. Construct a truth table for 

(a) (p ∨ q) ↔ (~r ∧ ~s).

(b) (p ∧ q) → ~[~p ∧ (r ∨ s)].

10. Construct a truth table for ~[(~p ∧ ~q) ∧ (p ∨ r)].

[Ans. TTTTTTFT.]

11. Find a simpler statement having the same truth table as the one 
found in Exercise 10. 

*4. STATEMENTS HAVING GIVEN TRUTH TABLES 

In the preceding two sections we showed how to construct the truth 
table for any compound statement. It is also interesting to consider 
the converse problem, namely, given a truth table to find one or more 
statements having this truth table. The converse problem always has 
a solution, and in fact, a solution using only the connectives ∧, ∨,
and ~. The discussion which we give here is valid only for a truth
table in three variables but can easily be extended to cover the case 
of n variables. 

As observed in the last section a truth table with three variables 
has eight rows, one for each of the eight possible triples of truth values. 
Suppose that our given truth table has its last column consisting
entirely of F’s. Then it is easy to check that the truth table of the
statement p ∧ ~p also has only F’s in its last column, so that this
statement serves as an answer to our problem. We now need consider
only truth tables having one or more T’s. The method that we shall 
use is to construct statements that are true in one case only, and then 
to construct the desired statement as a disjunction of these. 

It is not hard to construct statements that are true in only one case. 
In Figure 16 are listed eight such statements, each true in exactly 


  p    q    r    Basic Conjunctions
  T    T    T    p ∧ q ∧ r
  T    T    F    p ∧ q ∧ ~r
  T    F    T    p ∧ ~q ∧ r
  T    F    F    p ∧ ~q ∧ ~r
  F    T    T    ~p ∧ q ∧ r
  F    T    F    ~p ∧ q ∧ ~r
  F    F    T    ~p ∧ ~q ∧ r
  F    F    F    ~p ∧ ~q ∧ ~r

Figure 16
one case. We shall call such statements basic conjunctions. Such a
basic conjunction contains each variable or its negation, depending
on whether the line on which it appears in Figure 16 has a T or an F
under the variable. Observe that the disjunction of two such basic
conjunctions will be true in exactly two cases, the disjunction of three
in three cases, etc. Therefore, to find a statement having a given truth
table, simply form the disjunction of those basic conjunctions which
occur in Figure 16 on the rows where the given truth table has T’s.
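
The method just described is mechanical enough to be sketched in a few lines of Python. The sketch is only an illustration under our own assumptions: the function names basic_conjunction and statement_for are hypothetical, the rows are taken in the order of Figure 16, and the ASCII symbols & and | stand for "and" and "or".

```python
from itertools import product

def basic_conjunction(row, names=("p", "q", "r")):
    """Return the basic conjunction that is true only in the given row."""
    parts = [name if value else "~" + name for name, value in zip(names, row)]
    return " & ".join(parts)

def statement_for(column, names=("p", "q", "r")):
    """Build a statement whose truth table has the given last column.

    `column` lists the desired truth values in the row order of
    Figure 16 (TTT, TTF, ..., FFF).
    """
    rows = product([True, False], repeat=len(names))
    terms = [basic_conjunction(row, names)
             for row, wanted in zip(rows, column) if wanted]
    if not terms:
        # A column consisting entirely of F's: p & ~p is false in every case.
        return f"{names[0]} & ~{names[0]}"
    return " | ".join(f"({term})" for term in terms)

# T's in the first, second, and last rows, F's elsewhere.
column = [True, True, False, False, False, False, False, True]
print(statement_for(column))
```

For a table in n variables the same sketch applies once n names are supplied; the column then has 2^n entries.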

Example 1. Find a statement whose truth table has T’s in the
first, second, and last rows, and F’s in the other rows. The required
statement is the disjunction of the first, second, and eighth basic
conjunctions, that is

(p ∧ q ∧ r) ∨ (p ∧ q ∧ ~r) ∨ (~p ∧ ~q ∧ ~r).

In Exercise 2 you will show that this statement has the required truth 
table. 

Example 2. A logician is captured by a tribe of savages and placed 
in a jail having two exits. The savage chief offers the captive the 
following chance to escape: “One of the doors leads to certain death 
and the other to freedom. You can leave by either door. To help you 
in making a decision, two of my warriors will stay with you and 
answer any one question which you wish to ask of them. I must warn 
you, however, that one of my warriors is completely truthful while 
the other always lies.” The chief then leaves, believing that he has 
given his captive only a sporting chance to escape. 

After thinking a moment our quick-witted logician asks one
question and then chooses the door leading to freedom. What question
did he ask? 

Let p be the statement “The first door leads to freedom,” and q 
be the statement “You are truthful.” It is clear that p and q are 
useless questions in themselves, so let us try compound statements. 
We want to ask a single question for which a “yes” answer means 
that p is true and a “no” answer means that p is false, regardless of 
which warrior is asked the question. The answers desired to these 
questions are listed in Figure 17. 

The next thing to consider is, what would be the truth table of a 
question having the desired answers. If the warrior answers “yes” 

  p    q    Desired Answer    Truth Table of Question
  T    T    yes               T
  T    F    yes               F
  F    T    no                F
  F    F    no                T

Figure 17


and if he is truthful, that is if q is true, then the truth value is T. 
But if he answers “yes” and he is a liar, that is if q is false, then the 
truth value is F. A similar analysis holds if the answer is “no.” The 
truth values of the desired question are shown in Figure 17. 

Therefore we have reduced the problem to that of finding a
statement having the truth table of Figure 17. Following the general
method outlined above, we see that the statement

(p ∧ q) ∨ (~p ∧ ~q)

will do. Hence the logician asks the question: “Does the first door 
lead to freedom and are you truthful, or does the second door lead to 
freedom and are you lying?” The reader can show (Exercise 3) that 
the statement p ↔ q also has the truth table given in Figure 17, hence
a shorter equivalent question would be: “Does the first door lead to 
freedom if and only if you are truthful?” 
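
The reasoning of this example can be checked case by case. The following Python sketch is an illustration only; the function name says_yes is our own, and we assume that a liar simply reverses the truth value of whatever question is put to him.

```python
def says_yes(p, q):
    """Answer of the warrior addressed to the question "p if and only if q".

    p: the first door leads to freedom.
    q: the warrior addressed is truthful.
    """
    question = (p == q)                      # truth value of p <-> q
    return question if q else not question   # a liar reverses the truth value

for p in (True, False):
    for q in (True, False):
        # The longer question has the same truth value as the shorter one.
        long_form = (p and q) or (not p and not q)
        print(p, q, "yes" if says_yes(p, q) else "no", long_form == (p == q))
```

In each of the four cases the answer is "yes" exactly when p is true, whichever warrior is asked.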

As can be seen in Example 2, the method does not necessarily yield 
the simplest possible compound statement. However, it has two
advantages: (1) It gives us a mechanical method of finding a statement
that solves the problem. (2) The statement appears in a standard 
form. The latter will be made use of in designing switching circuits 
(see Section 12). 


EXERCISES 

1. Show that each of the basic conjunctions in Figure 16 has a truth
table consisting of one T appearing in the row in which the statement
appears in Figure 16, and all the rest F’s.

2. Find the truth table of the compound statement constructed in
Example 1.

3. Show in Example 2 that the statement p ↔ q has the truth table of
Figure 17.

4. Construct one or more compound statements having each of the 
following truth tables, (a), (b), and (c). 


  p    q    r    (a)   (b)   (c)
  T    T    T    T     F     T
  T    T    F    F     F     T
  T    F    T    T     F     T
  T    F    F    F     T     F
  F    T    T    F     F     T
  F    T    F    F     F     T
  F    F    T    T     F     F
  F    F    F    F     F     T


5. Using only ∨, ∧, and ~, write a statement equivalent to each of the
following:

(a) p ↔ q.

(b) p → q.

(c)

6. Using only ∨ and ~, write down a statement equivalent to p ∧ q.
Use this result to prove that any truth table can be represented by means
of the two connectives ∨ and ~.

In Exercises 7-10 we will study the new connective ↓, where p ↓ q
expresses “neither p nor q.”

7. Construct the truth table of p ↓ q.

8. Construct the truth table for p ↓ p. What other compound has this
truth table? [Ans. Same as Figure ...]

9. Construct the truth table for (p ↓ q) ↓ (p ↓ q). What other
compound has this truth table? [Ans. Same as Figure 3.]

10. Use the results of Exercises 6, 8, and 9 to show that any truth table
can be represented by means of the single connective ↓.

11. Use the results of Exercises 9, 10 following Section 2 to show that any 
truth table can be represented by means of the single connective |. 

12. Write down a compound of p, q, r which is true if and only if exactly
one of the three components is true.

13. The “basic conjunctions” for statements having only one variable are
p and ~p. Discuss the various compound statements that can be formed
by disjunctions of these. How do these relate to the possible truth tables for
statements of one variable? What can be asserted about an arbitrary
compound, no matter how long, that contains only the variable p?

[Ans. There are four possible truth tables.]

14. In Example 2 there is a second question, having a different truth table 
than that in Figure 17, which the logician can ask. What is it? 

15. A student is confronted with a true-false exam, consisting of five
questions. He knows that his instructor always has more true than false
questions, and that he never has three questions in a row with the same 
answer. From the nature of the first and last questions he knows that these 
must have the opposite answer. The only question to which he knows the 
answer is number two. And this assures him of having all answers correct. 
What did he know about question two? What is the answer to the five 
questions? [Ans. TFTTF.] 


5. LOGICAL POSSIBILITIES 

One of the most important contributions that mathematics can 
make to the solution of a scientific problem is to provide an exhaustive 
analysis of the logical possibilities for the problem. The role of science 
is then to discover facts which will eliminate all but one possibility. 

Given the analysis of logical possibilities, we can ask for each
assertion about the problem, and for each logical possibility, whether the
assertion is true in this case. Normally, for a given statement there 
will be many cases in which it is true and many in which it is false. 
Logic will be able to do no more than to point out the cases in which 
the statement is true. However, there are two notable exceptions, 
namely, a statement that is true in every logically possible case, and 
one that is false in every case. Here logic alone suffices to determine 
the truth value. 

A statement that is true in every logically possible case is said to be 
logically true. The truth of such a statement follows from the meaning
of the words and the form of the statement, together with the context 
of the problem about which the statement is made. We will see several 
examples of logically true statements below. A statement that is false 
in every logically possible case is said to be logically false, or to be a 
self-contradiction. For example, the conjunction of any statement
with its own negation will always be a self-contradiction, since it
cannot be true under any circumstances.

Example 1. Let us consider the following problem, which is of a 
type often studied in probability theory. “There are two urns; the 
first contains two black balls and one white ball, while the second 
contains one black ball and two white balls; select an urn at random 
and draw two balls in succession from it. What is the probability 
that. . . ?” Without raising questions of probability, let us ask what 
the possibilities are. Figures 18 and 19 give us two ways of analyzing 
the logical possibilities. 


  Case   Urn   First Ball   Second Ball
  1      1     black        black
  2      1     black        white
  3      1     white        black
  4      2     black        white
  5      2     white        black
  6      2     white        white

Figure 18


In Figure 18 we have analyzed the possibilities as far as the colors of
the balls drawn were concerned. Such an analysis may be sufficient for
many purposes. In Figure 19 we have carried out a finer analysis, in
which we distinguished between balls of the same color in an urn.
For some purposes the finer analysis may be necessary. It is
important to realize that the possibilities in a given problem may be
analyzed in many different ways, from a very rough grouping to a highly
refined one. The only requirements on an analysis of logical
possibilities are: (1) that under any conceivable circumstances one and only
one of these possibilities must be the case, and (2) that the analysis
is fine enough so that the truth value of each statement (under
consideration in the problem) is determined in each case.

It is easy to verify that both analyses (Figures 18 and 19) satisfy 
the first condition. Whether the rougher analysis will satisfy the 
second condition depends on the nature of the problem. If we can
limit ourselves to statements like “Two black balls are drawn from 
the first urn,” then it suffices. But if we wish to consider “The first 
black ball is drawn after the second black ball from the second urn,” 
then the finer analysis is needed. 


  Case   Urn   First Ball    Second Ball
  1      1     black no. 1   black no. 2
  2      1     black no. 2   black no. 1
  3      1     black no. 1   white
  4      1     black no. 2   white
  5      1     white         black no. 1
  6      1     white         black no. 2
  7      2     black         white no. 1
  8      2     black         white no. 2
  9      2     white no. 1   black
  10     2     white no. 2   black
  11     2     white no. 1   white no. 2
  12     2     white no. 2   white no. 1

Figure 19


Let us consider the statement “One white ball and one black are 
drawn.” In Figure 18 this will be true in cases 2, 3, 4, and 5, false in 
cases 1 and 6. For any statement we will normally find a number of 
cases in which it is true, others in which it is false. However, an
exception to this is a statement like “At most two black balls are
drawn,” which is true in every case, in either analysis. Hence this 
statement is logically true. It follows from the very definition of the 
problem that we cannot draw more than two balls. Hence, also, the 
statement “Draw three white balls,” is logically false. 
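
The two analyses of this example can be reproduced mechanically. The sketch below is offered only as an illustration; the names urns, fine, rough, and color are our own, and the order in which the twelve finer cases are generated need not agree with the order of Figure 19.

```python
from itertools import permutations

urns = {1: ["black no. 1", "black no. 2", "white"],
        2: ["black", "white no. 1", "white no. 2"]}

# Finer analysis: ordered draws of two different balls from one urn.
fine = [(urn, first, second)
        for urn, balls in urns.items()
        for first, second in permutations(balls, 2)]

def color(ball):
    """Forget which ball of a given color was drawn."""
    return ball.split()[0]

# Rougher analysis: keep only the colors and drop repeated cases.
rough = []
for urn, first, second in fine:
    case = (urn, color(first), color(second))
    if case not in rough:
        rough.append(case)

# Cases of the rougher analysis in which one white and one black ball are drawn.
mixed = [n for n, (_, a, b) in enumerate(rough, start=1) if a != b]
print(len(fine), len(rough), mixed)
```

The rougher list comes out in the order of Figure 18, so the case numbers printed can be compared directly with those quoted above.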

What the logical possibilities are for a given set of statements will
depend on the context, i.e., on the problem that is being considered. 
Unless we know what the possibilities are, we have not understood 
the task before us. This does not preclude the possibility that there 
may be several ways of analyzing the logical possibilities. In Example 
1 above, e.g., we gave two different analyses, and others could be 
found. In general, the question “How many cases are there in which 
p is true,” will depend on the analysis given. (This will be of
importance in our study of probability theory.) However, the logically
true and logically false statements are exceptions. A statement that 
is logically true (false) according to one analysis will be logically true 
(false) according to every other analysis of the given problem. 

The truth table method was used at the beginning of the chapter 
for analyzing logical possibilities and this is a very rough but
convenient method. Suppose that we have a statement compounded
from p, q, and r. There may be a hundred cases possible according to
some fine analysis. But some of these can be lumped together, since 
it is necessary only to distinguish those cases where the truth values 
of the three components turn out to be different. Then we can get 
at most eight cases, corresponding to the eight lines of the truth table. 

In the truth tables we have always assumed, tacitly, that all these 
cases will occur, which amounts to saying that the components are 
logically unrelated (see Section 8). If this assumption is satisfied, the 
truth table is a perfectly satisfactory means of testing for logically 
true (false) statements. Since the truth value of the compound
depends only on the truth values of its components, no other information
is relevant. Hence the eight cases suffice for the testing of logically 
true (false) statements. 

For example, a statement of the form p → (p ∨ q) will have to be
true in every conceivable case. We may have a hundred cases, giving
varying truth values for p and q, but every such case must correspond
to one of the four truth table cases, as far as the compound is
concerned. In each of these four cases the compound is true, and
therefore such a statement is logically true. An example of it is “If Jones
is smart, then he is smart or lucky.” 
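
When the components are logically unrelated, the truth table test can be carried out by a short program. The following Python sketch is an illustration only; the function names classify and implies are hypothetical, and the test is valid only under the assumption that every line of the truth table is a possible case.

```python
from itertools import product

def implies(a, b):
    """Truth value of the conditional 'if a then b'."""
    return (not a) or b

def classify(statement, n):
    """Classify a compound of n unrelated components by its truth table."""
    values = [statement(*row) for row in product([True, False], repeat=n)]
    if all(values):
        return "logically true"
    if not any(values):
        return "self-contradiction"
    return "neither"

print(classify(lambda p, q: implies(p, p or q), 2))
print(classify(lambda p: p and not p, 1))
print(classify(lambda p, q: implies(p, q), 2))
```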

However, if the components are logically related, then a truth table 
analysis may not be adequate. Let p be the statement “Jim is taller 
than Bill,” while q is “Bill is taller than Jim.” And consider the 
statement, “Either Jim is not taller than Bill or Bill is not taller than 
Jim,” i.e., ~p ∨ ~q. If we work the truth table of this compound,
we find that it is false in the first case. But this case is not logically 
possible, since under no circumstances can p and q both be true! Our 
compound is logically true, but a truth table will not show this. Had 
we made a careful analysis of the possibilities as to the heights of the 
two men, we would have found that the compound statement is true 
in every case. (Such relations will be considered in Section 8. This 
particular pair of statements will be considered in Exercise 11 in that 
section.) 
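
The difference between the two tests can be seen in a small computation. The sketch below is an illustration only; the three-way analysis of the heights (Jim taller, Bill taller, same height) is our own assumption about what a careful analysis might look like.

```python
from itertools import product

# Truth table: treats p and q as if all four combinations could occur.
table = [(not p) or (not q) for p, q in product([True, False], repeat=2)]

# Analysis of the heights of the two men (an assumed three-way analysis).
possibilities = ["Jim taller", "Bill taller", "same height"]
values = []
for case in possibilities:
    p = (case == "Jim taller")    # "Jim is taller than Bill"
    q = (case == "Bill taller")   # "Bill is taller than Jim"
    values.append((not p) or (not q))

print(table)    # the first entry, for p and q both true, is False
print(values)   # true in every logically possible case
```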

Example 2. As a more complicated example let us consider the 
classification of human beings according to height, hair color, and 
sex that is carried out in Figure 20. Whether this analysis into 24 
cases is adequate will depend on the problem. For example, if we 


  Case   Height   Hair Color   Sex
  1      tall     blond        male
  2      tall     blond        female
  3      tall     brown        male
  4      tall     brown        female
  5      tall     black        male
  6      tall     black        female
  7      tall     red          male
  8      tall     red          female
  9      medium   blond        male
  10     medium   blond        female
  11     medium   brown        male
  12     medium   brown        female
  13     medium   black        male
  14     medium   black        female
  15     medium   red          male
  16     medium   red          female
  17     short    blond        male
  18     short    blond        female
  19     short    brown        male
  20     short    brown        female
  21     short    black        male
  22     short    black        female
  23     short    red          male
  24     short    red          female

Figure 20
want to allow for white hair or baldness, we must have more cases.

The statement “He is a tall man,” is true in cases 1, 3, 5, 7, and
false in the others. “She is a woman who is neither short nor
red-haired” will be true in cases 2, 4, 6, 10, 12, and 14. On the other hand,
the statement “The person is tall, medium, or short,” furnishes no
information. It is true in every case, hence logically true. In
contrast, the statement “He is a man of less than medium height,
not blond, brown, or red-haired, and not a short black-haired man”
is a self-contradiction.
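
The case numbers quoted in this example can be checked by listing the 24 possibilities in the order of Figure 20. The Python sketch below is an illustration only; the helper true_cases and the variable names are our own.

```python
from itertools import product

heights = ["tall", "medium", "short"]
hairs = ["blond", "brown", "black", "red"]
sexes = ["male", "female"]

# The 24 cases, numbered from 1 in the order of Figure 20.
cases = list(product(heights, hairs, sexes))

def true_cases(statement):
    """Numbers of the cases in which the statement is true."""
    return [n for n, case in enumerate(cases, start=1) if statement(*case)]

tall_man = true_cases(lambda h, c, s: h == "tall" and s == "male")
woman = true_cases(lambda h, c, s: s == "female" and h != "short" and c != "red")
any_height = true_cases(lambda h, c, s: h in heights)
contradiction = true_cases(
    lambda h, c, s: s == "male" and h == "short"
    and c not in ("blond", "brown", "red")
    and not (h == "short" and c == "black" and s == "male"))

print(tall_man, woman, len(any_height), contradiction)
```

A statement is logically true when its list contains all 24 case numbers, and a self-contradiction when its list is empty.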

Of all the logical possibilities, one and only one represents the facts 
as they are. That is, for a given person one and only one of the 24 cases 
is a correct description. To know which one, we need factual
information. When we say that a certain statement is “true,” without
qualifying it, we mean that it is true in this one case. But, as we have
said before, what the case actually is lies outside the domain of logic. 
Logic can tell us only what the circumstances (logical possibilities) 
are under which a statement is true. 

EXERCISES 

1. Prove that the negation of a logically true statement is logically false, 
and the negation of a logically false statement is logically true. 

2. Classify the following as (i) logically true, (ii) a self-contradiction, 
(iii) neither. 

(a) p ↔ p. [Ans. Logically true.]

(b) p → ~p.

(c) (p ∨ q) ↔ (p ∧ q). [Ans. Neither.]

(d) (p → ~q) → (q → ~p).

(e) (p → q) ∧ (q → r) ∧ ~(p → r). [Ans. Self-contradiction.]

(f) (p → q) → p.

(g) [(p → q) → p] → p.

3. Figure 20 gives the possible classifications of one person according to
height, color of hair, and sex. How many cases do we get if we classify two
people jointly? [Ans. 576.]

4. For each of the 24 cases in Figure 20 state whether the following
statement is true: “The person has red hair, and, if the person is a woman, then
she is short.” 

5. In Example 1, with the logical possibilities given by Figure 18, state 
the cases in which the following statements are true. 

(a) Urn one is selected.

(b) At least one white ball is drawn. 

(c) At most one white ball is drawn. 

(d) If the first ball drawn is white, then the second is black. 

(e) Two balls of different color are drawn if and only if urn one is 
selected. 

6. In Example 1 give two logically true and two logically false statements 
(other than those in the text). 

7. In a college using grades A, B, C, D, and F, how many logically possible
report cards are there for a student taking four courses? [Ans. 625.]

8. A man has nine coins totaling 78 cents. What are the logical
possibilities for the distribution of the coins? [Hint: There are three possibilities.]

9. In Exercise 8, which of the following statements are logically true and 
which are logically false? 

(a) He has at least one penny. [Ans. Logically true.] 

(b) He has at least one nickel. [Ans. Neither.] 

(c) He has exactly two nickels. [Ans. Logically false.] 

(d) He has exactly three nickels if and only if he has exactly one dime. 

[Ans. Logically true.] 

10. In Exercise 8, we are told that the man has no nickel in his possession.
What can we infer from this? 

11. Two dice are rolled. Which of the following analyses satisfy the first 
condition for logical possibilities? What is wrong with the others? 

The sum of the numbers shown is: 

(a) : (1) 6, (2) not 6. 

(b) : (1) an even number, (2) less than 6, (3) greater than 6. 

(c) : (1) 2, (2) 3, (3) 4, (4) more than 4. 

(d) : (1) 7 or 11, (2) 2, 3, or 12, (3) 4, 5, 6, 8, 9, or 10. 

(e) : (1) 2, 4, or 6, (2) an odd number, (3) 10 or 12. 

(f) : (1) less than 5 or more than 8, (2) 5 or 6, (3) 7, (4) 8. 

(g) : (1) more than 5 and less than 10, (2) at most 4, (3) 7, (4) 11 or 12. 

[Ans. (a), (c), (d), (f) satisfy the condition.] 


6. TREE DIAGRAMS 

A very useful tool for analyzing logical possibilities is the drawing 
of a “tree.” This device will be illustrated by several examples. 

Example 1. Consider again the example in Figure 20. Suppose
we let the classification proceed as follows: first consider all human
beings before classification as being all in one group; next split this
large group into three subgroups by putting the short people in one 
group, the medium height in another, and the tall people in the third; 
next split up each of these subgroups into four smaller subgroups 
(making a total of twelve in all) according to hair color; finally, split 
each of these subgroups into two parts by grouping males together 
and females together. The final classification then divides the group 
of all human beings into 24 subgroups. Figure 21 gives a graphical 


[Tree diagram: “All people” (Start) at the bottom; 24 end points labeled M or F (End) at the top.]

Figure 21


representation of the process described above. For obvious reasons 
we shall call a figure like this, which starts at a point and branches 
out, a tree. 

Observe that the tree contains all the information relevant for the 
classification problem. Each path through the tree from the start to 
the end (bottom to top) represents a logical possibility. There are 
24 in all, one for each end point of the tree, and similarly there are 24 
cases in Figure 20. The order in which we performed the classification 
is arbitrary, that is, we might equally well have first classified people 
according to hair color, then sex, and finally height. We would still 
get 24 logical possibilities but the tree that we would obtain would 
differ from that of Figure 21 (see Exercise 1). 

Example 2. Next let us consider the example of Figure 18. This
is a three-stage process: first we select an urn, then draw a ball, and
then draw a second ball. The tree of logical possibilities is shown in
Figure 22. We note that six is the correct number of logical
possibilities. The reason for this is: if we choose the first urn (which contains
two black balls and one white ball) and draw from it a black ball,
then the second draw may be of either color; however, if we draw a
white ball first, then the second ball drawn is necessarily black.
Similar remarks apply if the second urn is chosen.



Figure 22 

The student should observe that in the tree of Figure 21 each point 
on the same level has the same number of branches leading out of it, 
while in the tree of Figure 22 this is not the case. 
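
A tree whose branches depend on what has gone before, like that of Example 2, can be generated stage by stage. The following Python sketch is an illustration only; the function name draw_paths and the representation of an urn as a table of color counts are our own assumptions.

```python
def draw_paths(counts, draws):
    """All color sequences obtainable by drawing `draws` balls in
    succession, without replacement, from an urn with the given counts."""
    if draws == 0:
        return [[]]
    paths = []
    for ball_color, n in counts.items():
        if n > 0:                        # a branch exists only if a ball is left
            rest = dict(counts)
            rest[ball_color] = n - 1
            for tail in draw_paths(rest, draws - 1):
                paths.append([ball_color] + tail)
    return paths

urn1 = {"black": 2, "white": 1}
urn2 = {"black": 1, "white": 2}

# Stage one: choose an urn; stages two and three: draw two balls.
tree = [(urn, path)
        for urn, counts in ((1, urn1), (2, urn2))
        for path in draw_paths(counts, 2)]

for urn, path in tree:
    print(urn, path)
```

The six paths printed correspond to the six cases of Figure 18. After a white ball from the first urn only one branch remains, which is why the points on one level need not have the same number of branches.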


Example 3. As a final example let us construct the tree of logical 
possibilities for the outcomes of a World Series played between the 
Dodgers and the Yankees. In Figure 23 is shown half of the tree, 



Figure 23 

corresponding to the case when the Dodgers win the first game (the
dotted line at the bottom leads to the other half of the tree). In the 
figure a ‘D’ stands for a Dodger win and ‘Y’ for a Yankee win. There 
are 35 possible outcomes (corresponding to the circled letters) in the 
half-tree shown, so that the World Series can end in 70 ways. 

This example is different from the previous two in that the paths 
of the tree end at different levels, corresponding to the fact that the 
World Series ends whenever one of the teams has won four games. 

Not always do we wish as detailed an analysis as that provided in 
the examples above. If, in Example 2, we wanted to know only the 
color and order in which the balls were drawn and not which urn they 
came from, then there would be only four logical possibilities instead 
of six. Then in Figure 22 the second and fourth paths (counting from 
the left) represent the same outcome, namely, a black ball followed 
by a white ball. Similarly the third and fifth paths represent the same 
outcome. Finally, if we cared only about the color of the balls drawn,
not the order, then there are only three logical possibilities: two black 
balls, two white balls, or one black and one white ball. 

A less detailed analysis of the possibilities for the World Series is
also possible. For example we can analyze the possibilities as follows: 
Dodgers in 4, 5, 6, or 7 games; and Yankees in 4, 5, 6, or 7 games. 
The new classification reduced the number of possibilities from 70 to 8. 
The other possibilities have not been eliminated but merely grouped 
together. Thus the statement “Dodgers in 4 games” can happen in 
only one way, while “Dodgers in 7 games” can happen in 20 ways 
(see Figure 23). A still less detailed analysis would be a classification 
according to the number of games in the series. Here there are only 
four logical possibilities. 
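
The counts quoted for the World Series tree can be checked by letting a program follow every path until one team has won four games. The sketch below is an illustration only; the function name outcomes and the encoding of a series as a string of the letters D and Y are our own.

```python
def outcomes(d=0, y=0, path=""):
    """Every way the series can continue from d Dodger and y Yankee wins."""
    if d == 4 or y == 4:          # the series ends as soon as a team has four wins
        return [path]
    return outcomes(d + 1, y, path + "D") + outcomes(d, y + 1, path + "Y")

series = outcomes()
dodgers_first = [s for s in series if s[0] == "D"]
dodgers_in_4 = [s for s in series if len(s) == 4 and s[-1] == "D"]
dodgers_in_7 = [s for s in series if len(s) == 7 and s[-1] == "D"]
lengths = sorted({len(s) for s in series})

print(len(series), len(dodgers_first), len(dodgers_in_4),
      len(dodgers_in_7), lengths)
```

Grouping the paths by winner and length gives the eight possibilities of the rougher analysis; grouping by length alone gives four.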

The student will find that it often requires several trials before
the “best” way of listing logical possibilities is found for a given
problem.


EXERCISES 

1. Construct the tree for Example 1 if the order of classification is hair 
color, sex, and height. Do the same if the order of classification is sex, height, 
and hair color. Are there any other ways of performing this classification? 

2. In 1955 the Dodgers lost the first two games of the World Series, but
won the series in the end. In how many ways can the series go so that the
losing team wins the first two games? [Ans. 10.]

3. The following is a typical process in genetics: each parent has two 
genes for a given trait, AA or Aa or aa. The child will inherit one gene from
each parent. What are the possibilities for a child if both parents are AA? 
What if one is AA and the other aa? What if one is AA and the other Aa? 
What if both are Aa? Construct a tree for each process. [Let stage one be 
the choice of a gene from the first parent, stage two from the second parent. 
Then see how many different types the resulting branches represent.] 

4. It is often the case that types AA and Aa (see Exercise 3) are
indistinguishable from the outside, but easily distinguishable from type aa. What
are the logical possibilities if the two parents are of noticeably different types? 

5. A psychologist teaches a rat to run through a maze whose shape
is shown in the diagram. Let us assume that, if a rat gets into a blind
alley, it goes back to the last intersection and tries a different passage
than has previously ever been taken, but always in the direction of the
arrows. How many possible paths are there? How many are there if a rat
is stopped after making two wrong turns? [Ans. 20; 12.]

6. We set up an experiment similar to that of Figure 18, but urn one 
has two black balls and two white balls, while urn two has one white ball and 
four black balls. We select an urn, and draw three balls from it. Construct 
the tree of logical possibilities. How many cases are there? [Ans. 10.] 

7. From the tree constructed in Exercise 6 answer the following questions. 

(a) In how many cases do we draw three black balls? 

(b) In how many cases do we draw two black balls and one white ball? 

(c) In how many cases do we draw three white balls? 

(d) How many cases does this leave? What cases are these? [Ans. 3.] 

8. In how many ways can the World Series be played (see Figure 23)
if the Dodgers win the first game and

(a) No team wins two games in a row. [Ans. 1.] 

(b) The Dodgers win at least the odd-numbered games. [Ans. 5.] 

(c) The winning team wins four games in a row. [Ans. 4.] 

(d) The losing team wins four games. [Ans. 0.] 







9. A man is considering the purchase of one of four types of stocks. 
Each stock may go up, go down, or stay the same after his purchase. Draw 
the tree of logical possibilities. 

10. For the tree constructed in Exercise 9 give a statement which: 

(a) Is true in half the cases. 

(b) Is false in all but one case. 

(c) Is true in all but one case. 

(d) Is logically true. 

(e) Is logically false. 

11. In Exercise 6 we wish to make a rougher classification of logical
possibilities. What branches (in the tree there constructed) are identified if: 

(a) We do not care about the order in which the balls are drawn. 

(b) We care neither about the order of balls, nor about the number 
of the urn selected. 

(c) We care only about what urn is selected, and whether the balls 
drawn are all the same color. 

12. Work Exercise 7 of the last section, by sketching a tree diagram. 

13. A menu has a choice of soup or orange juice for an appetizer, a choice 
of steak, chicken, or fish for the entree, and a choice of pie or cake for dessert. 
A complete dinner consists of one choice in each case. Draw the tree for the 
possible complete dinners. 

(a) How many different complete dinners are possible? [Ans. 12.]

(b) How many complete dinners are there which have chicken for
the entree? [Ans. 4.]

(c) How many complete dinners are there available for a man who
will eat pie only if he had steak for the entree? [Ans. 8.]

7. LOGICAL RELATIONS 

Until now we have considered statements in isolation. Sometimes, 
however, we want to consider the relationship between pairs of
statements. The most interesting such relation is that one statement
(logically) implies the other one. If p implies q we also say that q
follows from p, or that q is (logically) deducible from p. For example, 
in any mathematical theorem the hypothesis implies the conclusion. 

If we have listed all logical possibilities for a pair of statements 
p and q, then we shall characterize implication as follows: p implies q
if q is true whenever p is true, i.e., if q is true in all the logically possible 
cases in which p is true. 

For compound statements having the same components, truth
tables provide a convenient method for testing this relation. In Figure
24 we illustrate this method. Let us take p ↔ q as our hypothesis.

Figure 24


Since it is true only in the first and fourth cases and p → q is true
in both these cases, we see that the statement p ↔ q implies p → q. On
the other hand the statement p ∨ q is false in the fourth case and
hence it is not implied by p ↔ q. Again, a comparison of the last two
columns of Figure 24 shows that the statement p → q does not imply
and is not implied by p ∨ q.

The relation of implication has a close affinity to the conditional 
statement, but it is important not to confuse the two. The conditional 
is a new statement compounded from two given statements, while 
implication is a relation between the two statements. The connection 
is the following: p implies q if and only if the conditional p → q is
logically true. 

That this is the case is shown by a simple argument. The statement 
p implies the statement q if q is true whenever p is true. This means 
that there is no case in which p is true and q false, i.e., no case in which 
p → q is false. But this in turn means that p → q is logically true.
In Exercise 1 this result will be applied to Figure 24. 

Let us now take up the “paradoxes” of the conditional. Conditional
statements sound paradoxical when the components are not related.
For example, it sounds strange to say that “If it is a nice day then
chalk is made of wood,” is true on a rainy day. It must be remembered
that the conditional statement just quoted means no more and
no less than that one of the following holds: (1) It is a nice day and
chalk is made of wood, or (2) It is not a nice day and chalk is made of
wood, or (3) It is not a nice day and chalk is not made of wood. [See
Figure 11b.] And on a rainy day number 3 happens to be correct.

But it is by no means true that “It is a nice day,” implies that
“Chalk is made of wood.” It is logically possible for the former to
be true and for the latter to be false (indeed, this is the case on a nice
day, with the usual method of chalk manufacture), hence the implication
does not hold. Thus while the conditional quoted in the previous
paragraph is true on a given day, it is not logically true. 

In common parlance “if . . . then . . .” is usually asserted on 
logical grounds. Hence any usage in which such an assertion happens 
to be true, but is not logically true, sounds paradoxical. Similar remarks
apply to the common usage of “if and only if.”

If the biconditional p ↔ q is not only true but logically true, then
this establishes a relation between p and q. Since p ↔ q is true in
every logically possible case, the statements p and q have the same
truth value in every case. We say, under these circumstances, that
p and q are (logically) equivalent. For compound statements having
the same components, the truth table provides a convenient means
of testing for equivalence. We merely have to verify that the compounds
have the same truth table. Figure 25 establishes that p → q
is equivalent to ~p ∨ q.
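
The equivalence test can likewise be sketched in Python: compute both truth tables and compare them. The helper names `truth_table` and `equivalent` are assumptions made for this illustration, and the conditional is defined case by case so that the comparison is not circular.

```python
from itertools import product

def truth_table(statement, n_vars=2):
    """Truth values of a compound over all cases, in the order TT, TF, FT, FF."""
    return [statement(*case) for case in product([True, False], repeat=n_vars)]

def equivalent(a, b, n_vars=2):
    """Two compounds are equivalent if they have the same truth table."""
    return truth_table(a, n_vars) == truth_table(b, n_vars)

conditional = lambda p, q: q if p else True   # p -> q: false only when p is true and q false
not_p_or_q = lambda p, q: (not p) or q        # ~p v q

print(truth_table(conditional))               # [True, False, True, True]
print(equivalent(conditional, not_p_or_q))    # True
```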

EXERCISES

1. Show that (p ↔ q) → (p → q) is logically true, but that (p ↔ q) →
(p ∨ q) is not logically true.

2. Prove that p is equivalent to q just in case p implies q and q implies p. 

3. Construct truth tables for the following compounds, and test for
implications and equivalences.

(a) p ∧ q.

(b) p → ~q.

(c) ~p ∨ ~q.

(d) ~p ∨ q.

(e) p ∧ ~q. [Ans. (b) equiv. (c); (a) impl. (d); (e) impl. (b), (c).]

4. Construct truth tables for the following compounds, and arrange them
in order so that each compound implies all the following ones.

(a) ~p ↔ q.

(b) p → (~p → q).

(c) ~[p → (q → p)].

(d) p ∨ q.

(e) … ∧ q. [Ans. (c), (e), (a), (d), (b).]

5. Construct a compound equivalent to p ∧ q, using only the connectives
~ and ∨.

6. Construct a compound equivalent to p ↔ q using only the connectives
→ and ∧. (Cf. Exercise 2.)

7. Construct a compound statement equivalent to p ∨ q, using only
the connectives ~ and ∧.

8. If p is logically true, prove that:

(a) p ∨ q is logically true.

(b) ~p ∧ q is logically false.

(c) p ∧ q is equivalent to q.

(d) ~p ∨ q is equivalent to q.

9. If p and q are logically true and r is logically false, what is the status of
(p ∨ ~q) ∧ ~r? [Ans. Logically true.]

10. Prove that the conjunction or disjunction of a statement with itself 
is equivalent to the statement. 

11. Prove that the double negation of a statement is equivalent to the 
statement. 


12. Prove that a statement which implies its own negation is a self- 
contradiction. 

13. What is the status of a statement equivalent to its own negation? 

14. What relation exists between two logically true statements? Between 
two self-contradictions? 

15. Prove that a logically true statement is implied by every statement, 
and that a self-contradiction implies every statement. 

16. Using the results of Section 4, Exercises 10 and 11, prove that for any 
compound statement there is an equivalent compound statement: 

(a) Whose only connective is ↓.

(b) Whose only connective is |. 

*8. A SYSTEMATIC ANALYSIS OF LOGICAL RELATIONS

The relation of implication is characterized by the fact that it is 
impossible for the hypothesis to be true and the conclusion to be false. 
If two statements are equivalent, it is impossible for one to be true 
and the other to be false. Thus we see that for an implication one 
truth table case must not occur, and for an equivalence two of the 
four truth table cases must not occur. The absence of one or more 
truth table cases is thus characteristic of logical relations. In this 
section we shall investigate all conceivable relations that can exist 
between two statements. 

We shall say that two statements are unrelated if each of the four 
truth table cases (see Figure 26) can occur. The two statements are 
related if one or more of the four cases in Figure 26 cannot occur. 
[Cf. Section 5.] 

If p and q are statements such that exactly one of the cases in
Figure 26 is excluded, then we say that there is a onefold relation
between them. Obviously there are four possible onefold relations
which we list below. (a) If case 1 is excluded, the two statements
cannot both be true. In this case p and q are said to be a pair of
contraries or are said to be inconsistent. (b) If case 2 is excluded, then
(cf. Section 7) p implies q. (c) If case 3 is excluded, it is false that q
is true and p is false, that is, q implies p. (d) If case 4 is excluded,
both statements cannot be false, i.e., one of them is true. Such a
pair of statements is called a pair of subcontraries.

If p and q are statements such that exactly two of the cases in 
Figure 26 are excluded, then we say that there is a twofold relation 
between them. There are six ways in which two cases can be selected 
from four, but several of these do not produce interesting relations. 
For example, suppose cases 1 and 2 are excluded; then p cannot be 
true, i.e., it is logically false. Similarly, if cases 1 and 3 are excluded, 
then q is logically false. On the other hand, if cases 3 and 4 are excluded,
then p is logically true; and if 2 and 4 are excluded, then q
is logically true. Hence we see that these choices do not give us new
relations; they merely indicate that one of the two statements is
logically true or false. We now have only two alternatives remaining: 
(A) cases 2 and 3 are excluded, which means that the two statements
are equivalent; and (B) cases 1 and 4 are excluded, which means that
the two statements cannot both be true and cannot both be false;
in other words, one must be true and the other false. We shall then
say that p and q are contradictories.

It is not hard to see that there are no threefold relations, for if 
three of the cases in Figure 26 are excluded, then there is only one 
possibility for each of the two statements, so that each must be either 
logically true or logically false. 

We have already discussed implication and equivalence and have 
noted their connection to the conditional and the biconditional,
respectively. We can do the same for the three remaining relations.
If p and q are subcontraries, then they cannot both be false; since
this is the only case in which their disjunction is false, we see that p
and q are subcontraries if and only if p ∨ q is logically true. If p and q
are contraries, then they cannot both be true; since this is the only
case in which their conjunction is true, we see that p and q are
contraries if and only if p ∧ q is logically false. Finally, if p and q are
contradictories, then cases 1 and 4 of Figure 26 are excluded, hence
p ↔ q is logically false. (Note also that, if p and q are contradictories,
then p ∨ q is logically true.) The table in Figure 27 gives a summary
of the relevant facts about the six relations we have derived.


Case(s) Excluded     Relation               Alternate Definition

T-T                  Contraries             p ∧ q logically false
F-F                  Subcontraries          p ∨ q logically true
T-F                  First implies second   p → q logically true
F-T                  Second implies first   q → p logically true
T-F and F-T          Equivalents            p ↔ q logically true
T-T and F-F          Contradictories        p ↔ q logically false

Figure 27


Subcontraries are not of great theoretical importance, but contraries
and contradictories are very important. Each of these relations
can be generalized to hold for more than two statements. If we
have n different statements, not all of which can be true, then we say
that they are inconsistent. Then the conjunction of these statements
must be false. Special cases of inconsistent statements are the following:
if n = 1, then we have a single self-contradictory statement; and
if n = 2, then we have a pair of inconsistent statements (i.e., a pair of
contraries).

If we have n different statements such that one and only one of 
them can be true, then we say they form a complete set of alternatives.
Again the special cases are: if n = 1, then we have a single logically 
true statement; and if n = 2, then we have a pair of contradictories. 
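
Both generalizations can be checked mechanically for compound statements with the same components. The following Python sketch is illustrative only; the function names are assumptions of this example, and statements are represented as lambdas over the truth values of p and q.

```python
from itertools import product

def count_true(statements, case):
    """Number of the given statements that are true in one logically possible case."""
    return sum(1 for s in statements if s(*case))

def inconsistent(statements, n_vars=2):
    """No logically possible case makes all of the statements true."""
    return all(count_true(statements, c) < len(statements)
               for c in product([True, False], repeat=n_vars))

def complete_set_of_alternatives(statements, n_vars=2):
    """Exactly one of the statements is true in every logically possible case."""
    return all(count_true(statements, c) == 1
               for c in product([True, False], repeat=n_vars))

four_cases = [
    lambda p, q: p and q,
    lambda p, q: p and not q,
    lambda p, q: (not p) and q,
    lambda p, q: (not p) and (not q),
]
print(complete_set_of_alternatives(four_cases))                            # True
print(complete_set_of_alternatives([lambda p, q: p, lambda p, q: not p]))  # True
print(inconsistent([lambda p, q: p, lambda p, q: q]))                      # False
```

The second call illustrates the special case n = 2: the contradictories p and ~p form a complete set of alternatives.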

Truth tables again furnish a method for recognizing when relations 
hold between statements. The examples below show how the method 
works. 
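
One way to mechanize this recognition is to collect the truth table cases that actually occur for a pair of statements and look up the excluded cases in Figure 27. The Python sketch below is only an illustration; the function name `relation` and the wording of the fallback message are assumptions of this example.

```python
from itertools import product

# Cases 1 to 4 for the truth values of the two statements: T-T, T-F, F-T, F-F.
CASES = [(True, True), (True, False), (False, True), (False, False)]

def relation(a, b, n_vars=2):
    """Name the relation between statements a and b by the cases that cannot occur."""
    occurring = {(bool(a(*c)), bool(b(*c)))
                 for c in product([True, False], repeat=n_vars)}
    excluded = frozenset(case for case in CASES if case not in occurring)
    names = {
        frozenset(): "unrelated",
        frozenset({(True, True)}): "contraries",
        frozenset({(False, False)}): "subcontraries",
        frozenset({(True, False)}): "first implies second",
        frozenset({(False, True)}): "second implies first",
        frozenset({(True, False), (False, True)}): "equivalents",
        frozenset({(True, True), (False, False)}): "contradictories",
    }
    return names.get(excluded, "one statement is logically true or logically false")

print(relation(lambda p, q: p, lambda p, q: q))             # unrelated
print(relation(lambda p, q: p and q, lambda p, q: p or q))  # first implies second
print(relation(lambda p, q: p, lambda p, q: not p))         # contradictories
```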

Examples. Consider the five compound statements, all having the 
same components, which appear in Figure 28. Find all relations which 
exist between pairs of these statements.

First of all we note that statements 3 and 5 have identical truth
tables, hence they are equivalent. Therefore we need consider only
one of them, say statement 3. Statements 1 and 2 have exactly opposite
truth tables, hence they are contradictories. Upon comparing
statements 1 and 3 we find no T-F case, so that 1 implies 3. Since
numbers 1 and 4 are never both true they are contraries, while numbers
2 and 3 are never both false, so that they are subcontraries.
Finally, upon comparing either 2 or 3 to 4 we find no F-T case and
hence both are implied by 4. Thus the six relations we found above
are all exemplified in Figure 28. Observe also that statements p and q
give an example of a pair of unrelated statements. [Cf. Section 5.]

EXERCISES 

1. Construct truth tables for the following four statements and state 
what relation (if any) holds between each of the six pairs formable. 

(a) ~p. 

(b) ~q. 

(c) p ∧ ~q.

(d) ~(~p ∨ q).

[Ans. (a) and (b) unrelated; (a) and (c), (d) contraries; (c), (d)
imply (b); (c) equiv. (d).]

2. Construct truth tables for each of the following six statements. Give 
an example of an unrelated pair, and an example of each of the six possible 
relations among these. 

(a) p^q. 

(b) p g. 

(c) A ~q. 

(d) (p A q) V (~p A ~q). 

(e) ~q. 

(f) p A ~q. 

3. Prove the following assertions: 

(a) The disjunction of two contradictory statements is logically true. 

(b) Two statements are equivalent if and only if either one implies 
the other one. 

(c) The contradictories of two contraries are subcontraries. 

4. What is the relation between the following pair of statements? 

(a) p → [p ∧ ~(q ∨ r)].

(b) ~p ∨ (~q ∧ ~r). [Ans. Equivalent.]

5. At most how many of the following assertions can one person
consistently believe?

(a) Joe is smart. 

(b) Joe is unlucky. 

(c) Joe is lucky but not smart. 

(d) If Joe is smart, then he is unlucky. 

(e) Joe is smart if and only if he is lucky. 

(f) Either Joe is smart, or he is lucky, but not both. [Ans. 4.] 

6. Prove the following assertions. 

(a) The contradictories of two equivalent statements are equivalent.

(b) In a complete set of alternatives any two statements are contraries.

(c) If p and q are subcontraries, and if each implies r, then r is logically 
true. 

7. Pick out a complete set of (four) alternatives from: 

(a) It is raining but the wind is not blowing. 

(b) It rains if and only if the wind blows. 

(c) It is not the case that it rains and the wind blows. 

(d) It is raining and the wind is blowing. 

(e) It is neither raining nor is the wind blowing. 

(f) It is not the case that it is raining or the wind is not blowing. 

[Ans. (a), (d), (e), (f).] 

8. What is the relation between [p ∨ ~(q ∨ r) ∨ (p ∧ s)] and
~(p ∧ q ∧ r ∧ s)? [Ans. Subcontraries.]

9. Suppose that p and q are contraries. What is the relation between 

(a) p and ~q. 

(b) ~p and q.

(c) ~p and ~q. 

(d) p and ~p. 

10. Let p , q, and r be three statements such that any two of them are 
unrelated. Discuss the possible relations among the three statements. [Hint: 
If we ignore the order of the statements, there are 14 such relations. The 
relations are at most fourfold. There are two fourfold relations, and the other 
relations are found from these by excluding fewer cases.] 

11. In Section 5 we considered an example comparing the height of two
men. Suppose that we allow for the possibilities: below 5 ft 9 in., 5 ft 9 in., 
5 ft 10 in., 5 ft 11 in., 6 ft 0 in., above 6 ft. We will, for the purpose of this 
problem, consider two men of the same height if they fall into the same 
category according to the above analysis. 

(a) Construct the set of all possibilities for a pair of men, Jim and Bill. 

(b) Find the cases in which “Jim is taller than Bill” is true. 

(c) Find the cases where “Bill is taller than Jim” is true. 

(d) Are all four truth table cases present? 

(e) What is the relation between the two statements? 

12. Construct the set of logical possibilities which classify a person with
respect to sex and marital status. 

(a) Show that “if the person is a bachelor then he is unmarried” is 
logically true. 

(b) Show that “if a person is an old maid then the person is a man”
is not logically false.

(c) Find the relation between “the person is a man” and “the person
is a bachelor.”

(d) Find a simple statement that is a subcontrary of “the person is a 
man,” and is consistent with it. 

9. VARIANTS OF THE CONDITIONAL 

The conditional of two statements differs from the biconditional 
and from disjunctions and conjunctions of these two in that it lacks 
symmetry. Thus p ∨ q is equivalent to q ∨ p, p ∧ q is equivalent
to q ∧ p, and p ↔ q is equivalent to q ↔ p; but p → q is not equivalent
to q → p. The latter statement, q → p, is called the converse of p → q.
Many of the most common fallacies in thinking arise from a confusion 
of a statement with its converse. 

It is also of interest to consider conditionals formed from the
statements p and q. The truth tables of these four conditionals together
with their names are tabulated in Figure 29. We note that p → q is
equivalent to ~q → ~p. The latter is called the contrapositive of the
former. For many arguments the contrapositive is a very useful form
of the conditional. In the same manner the statement ~p → ~q is
the converse of the contrapositive. Since the contrapositive is equivalent
to p → q, the converse of the former is equivalent to the converse
of the latter as can be seen in Figure 29.

Figure 29
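
The four truth tables can be regenerated with a short Python sketch. The dictionary keys and helper names are assumptions made for this illustration; the cases run in the order TT, TF, FT, FF.

```python
from itertools import product

cond = lambda a, b: (not a) or b   # truth value of "if a then b"

variants = {
    "conditional p -> q": lambda p, q: cond(p, q),
    "converse q -> p": lambda p, q: cond(q, p),
    "converse of contrapositive ~p -> ~q": lambda p, q: cond(not p, not q),
    "contrapositive ~q -> ~p": lambda p, q: cond(not q, not p),
}

cases = list(product([True, False], repeat=2))
tables = {name: [f(p, q) for p, q in cases] for name, f in variants.items()}

for name, column in tables.items():
    print(name, ["T" if v else "F" for v in column])
```

The conditional and the contrapositive both come out as T, F, T, T, while the converse and the converse of the contrapositive both come out as T, T, F, T.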

The use of conditionals seems to cause more trouble than the use 
of the other connectives, perhaps because of the lack of symmetry,
but also perhaps because there are so many different ways of expressing
conditionals. In many cases only a careful analysis of a conditional
statement shows whether the person making the assertion
means the given conditional or its converse. Indeed, sometimes he 
means both of these, i.e., he means the biconditional. (See Exercise 5.) 

The statement, “I will go for a walk only if the sun shines,” is a
variant of a conditional statement. A statement of the form “p only
if q” is closely related to the statement “If p then q,” but just how?
Actually the two express the same idea. The statement “p only if q”
states that “If ~q then ~p” and hence is equivalent to “If p then q.”
Thus the statement at the beginning of the paragraph is equivalent
to the statement, “If I go for a walk, then the sun will be shining.”

Other phrases, in common use by mathematicians, which indicate
a conditional statement are, “a necessary condition” and “a sufficient
condition.” To say that p is a sufficient condition for q means that
if p takes place, then q will also take place. Hence the sentence “p is
a sufficient condition for q” is equivalent to the sentence “If p then q.”

Similarly, the sentence “p is a necessary condition for q” is equivalent
to “q only if p.” Since we know that the latter is equivalent to
“If q then p,” it follows that the assertion of a necessary condition 
is the converse of the assertion of a sufficient condition. 

Finally, if both a conditional statement and its converse are asserted,
then effectively the biconditional statement is being asserted.
Hence the assertion “p is a necessary and sufficient condition for q”
is equivalent to the assertion “p if and only if q.”

EXERCISES 

1. Let p stand for “I will pass this course” and q for “I will do homework 
regularly.” Put the following statements into symbolic form. 

(a) I will pass the course only if I do homework regularly. 

(b) Doing homework regularly is a necessary condition for me to pass 
this course. 

(c) Passing this course is a sufficient condition for me to do homework 
regularly. 

(d) I will pass this course if and only if I do homework regularly. 

(e) Doing homework regularly is a necessary and sufficient condition 
for me to pass this course. 

2. Take the statement in part (a) of the previous exercise. Form its
converse, its contrapositive, and the converse of the contrapositive. For each
of these give both a verbal and a symbolic form.

3. Let p stand for “It snows,” and q for “The train is late.” Put the
following statements into symbolic form.

(a) Snowing is a sufficient condition for the train to be late. 

(b) Snowing is a necessary and sufficient condition for the train to be 
late. 

(c) The train is late only if it snows. 

4. Take the statement in part (a) of the previous exercise. Form its
converse, its contrapositive, and the converse of its contrapositive. Give a
verbal form of each of them. 

5. Prove that the conjunction of a conditional and its converse is
equivalent to the biconditional.

6. To what is the conjunction of the contrapositive and its converse 
equivalent? Prove it. 

7. Prove that 

(a) ~~p is equivalent to p.

(b) The contrapositive of the contrapositive is equivalent to the 
original conditional. 

8. “For a matrix to have an inverse it is necessary that its determinant
be different from zero.” Which of the following statements follow from this?
[No knowledge of matrices is required.] 

(a) For a matrix to have an inverse it is sufficient that its determinant 
be zero. 

(b) For its determinant to be different from zero it is sufficient for the 
matrix to have an inverse. 

(c) For its determinant to be zero it is necessary that the matrix have 
no inverse. 

(d) A matrix has an inverse if and only if its determinant is not zero. 

(e) A matrix has a zero determinant only if it has no inverse. 

[Ans. (b), (c), (e).]

9. “A function that is differentiable is continuous.” This statement is
true for all functions, but its converse is not always true. Which of the 
following statements are true for all functions? [No knowledge of functions 
is required.] 

(a) A function is differentiable only if it is continuous. 

(b) A function is continuous only if it is differentiable. 

(c) Being differentiable is a necessary condition for a function to be
continuous.

(d) Being differentiable is a sufficient condition for a function to be
continuous.

(e) Being differentiable is a necessary and sufficient condition for a
function to be differentiable. [Ans. (a), (d), (e).]

10. Prove that the negation of, “p is a necessary and sufficient condition
for q,” is equivalent to, “p is a necessary and sufficient condition for ~q.”

10. VALID ARGUMENTS 

One of the most important tasks of a logician is the checking of
arguments. By an argument we shall mean the assertion that a certain
statement (the conclusion) follows from other statements (the
premises). An argument will be said to be valid if and only if the
conjunction of the premises implies the conclusion, i.e., whenever the premises
are all true, the conclusion is also true.

It is important to realize that the truth of the conclusion is
irrelevant as far as the test of the validity of the argument goes. A true
conclusion is neither necessary nor sufficient for the validity of the
argument. The two examples below show this, and they also show the
form in which we shall state arguments, i.e., first we state the
premises, then draw a line and then state the conclusion.

Example 1.

If the United States is a democracy, then its
citizens have the right to vote.

Its citizens do have the right to vote.
------------------------------------------------
Therefore the United States is a democracy.

The conclusion is, of course, true. However, the argument is not 
valid since the conclusion does not follow from the two premises. 

Example 2. 

In a democracy the chief executive is elected directly 
by the people. 

In England the Prime Minister is the chief executive. 

The British Prime Minister is not directly elected. 

Therefore England is not a democracy. 

Here the conclusion is false, but the argument is valid since the 
conclusion follows from the premises. If we observe that the first 
premise is false, the paradox disappears. There is nothing surprising
in the correct derivation of a false conclusion from false premises.

If an argument is valid, then the conjunction of the premises
implies the conclusion. Hence if all the premises are true, then the
conclusion is also true. However, if one or more of the premises is 
false, so that the conjunction of all the premises is false, then the 
conclusion may be either true or false. In fact all the premises could 
be false, the conclusion true, and the argument valid, as the following 
example shows. 

Example 3. 

All dogs have two legs. 

All two-legged animals are carnivorous. 

Therefore, all dogs are carnivorous. 

Here the argument is valid and the conclusion is true, but both 
premises are false! 

Each of these examples underlines the fact that neither the truth
value nor the content of the statements appearing in an argument
affects the validity of the argument. In Figures 30 and 31 are two
valid forms of arguments:

p → q          p → q
p              ~q
-----          -----
∴ q            ∴ ~p

Figure 30      Figure 31

The symbol ∴ means “therefore.” The truth tables for these argument
forms appear in Figure 32.

Figure 32

For the argument of Figure 30, we see in Figure 32 that there is only
one case in which both premises are true, namely, the first case, and
in this case the conclusion is true, hence the argument is valid.
Similarly, in the argument of Figure 31, both premises are true in the
fourth case only, and in this case the conclusion is also true, hence
the argument is valid.

An argument that is not valid is called a fallacy. Two examples of 
fallacies are the following argument forms: 

p → q          p → q
q              ~p
-----          -----
∴ p            ∴ ~q

Fallacies

In the first fallacy both premises are true in the first and third cases 
of Figure 32, but the conclusion is false in the third case, so that the 
argument is invalid. (This is the form of Example 1.) Similarly, in 
the second fallacy we see that both premises are true in the last two 
cases, but the conclusion is false in the third case. 

We say that an argument depends only upon its form in that it 
does not matter what the components of the argument are. The truth 
tables in Figure 32 show that if both premises are true, then the con¬ 
clusions of the arguments in Figures 30 and 31 are also true. For the 
fallacies above, the truth tables show that it is possible to choose both 
premises true without making the conclusion true, namely, choose a 
false p and a true q. 
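
The validity test can be sketched in Python as a search for a case in which every premise is true and the conclusion false. The function name `valid` is an assumption of this example. The argument forms are read off from the discussion above: Figure 30 has premises p → q and p with conclusion q, Figure 31 has premises p → q and ~q with conclusion ~p, and the two fallacies conclude p from q and ~q from ~p.

```python
from itertools import product

def valid(premises, conclusion, n_vars=2):
    """Valid if the conclusion is true in every case where all premises are true."""
    return all(conclusion(*c)
               for c in product([True, False], repeat=n_vars)
               if all(prem(*c) for prem in premises))

p_implies_q = lambda p, q: (not p) or q

print(valid([p_implies_q, lambda p, q: p], lambda p, q: q))          # True  (Figure 30)
print(valid([p_implies_q, lambda p, q: not q], lambda p, q: not p))  # True  (Figure 31)
print(valid([p_implies_q, lambda p, q: q], lambda p, q: p))          # False (first fallacy)
print(valid([p_implies_q, lambda p, q: not p], lambda p, q: not q))  # False (second fallacy)
```

Both fallacies fail in the same case, a false p and a true q, which is the third case of the truth table.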

Example 4. Consider the following argument: 

p → q
q → r
-----
∴ p → r

The truth table of the argument appears in Figure 33.

Both premises are true in the first, fifth, seventh, and eighth rows
of the truth table. Since in each of these cases the conclusion is also
true, the argument is valid. (Example 3 can be written in this form.)
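
The row numbers quoted above can be reproduced with a short Python sketch. It assumes that the rows of Figure 33 are listed in the usual order TTT, TTF, TFT, TFF, FTT, FTF, FFT, FFF for p, q, r; the variable names are likewise assumptions of this example.

```python
from itertools import product

implies = lambda a, b: (not a) or b

# Rows 1 to 8 in the order TTT, TTF, TFT, TFF, FTT, FTF, FFT, FFF (assumed ordering).
rows = list(product([True, False], repeat=3))

both_premises_true = [i for i, (p, q, r) in enumerate(rows, start=1)
                      if implies(p, q) and implies(q, r)]
conclusion_holds = all(implies(p, r) for (p, q, r) in rows
                       if implies(p, q) and implies(q, r))

print(both_premises_true)  # [1, 5, 7, 8]
print(conclusion_holds)    # True
```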


EXERCISES 


1. Test the validity of the following arguments: 

(a) p <-> q (b) p V q 

V 


(c) 


P Aq 
u p^q 


[Ans. (a), (b) are valid.]


2. Test the validity of the following arguments: 


(a) 


P- 


(b) 


3. Test the validity of the argument 

p^q 

q V r 


4. Test the validity of the argument 

p\L q 


r^q —> r 
r^jp V r*ur 


r>*/f —> r^j'p 

[Ans. (b) is valid.] 


[Ans. Not valid.] 


5. Test the validity of the argument 

p-* q 

r^jp —» r^/q 

p A 


6. Given are the premises ~p → q and ~r → ~q. We wish to find a
valid conclusion involving p and r (if there is any).

(a) Construct truth tables for the two premises. 

(b) Note the cases in which the conclusion must be true. 

(c) Construct a truth table for a combination of p and r only, filling
in T wherever necessary.

(d) Fill in the remainder of the truth table, making sure that you do
not end up with a logically true statement.

(e) What combination of p and r has this truth table? This is a valid
conclusion. [Ans. p ∨ r.]

7. Translate the following argument into symbolic form, and test its 
validity. 

If this is a good course, then it is worth taking.

Either the grading is lenient, or the course is
not worth taking.

But the grading is not lenient.
------------------------------------------------
Therefore, this is not a good course. [Ans. Valid.]

8. Write the following argument in symbolic form, and test its validity: 

“For the candidate to win it is sufficient that he carry New York. 

He will carry New York only if he takes a strong stand on civil 
rights. He will not take a strong stand on civil rights. Therefore, 
he will not win.” 

9. Write the following argument in symbolic form and test its validity: 

“Father praises me only if I can be proud of myself. Either I do 
well in sports or I cannot be proud of myself. If I study hard, 
then I cannot do well in sports. Therefore, if father praises me, 
then I do not study hard.” 

10. Supply a conclusion to the following argument, making it a valid 
argument. [Adapted from Lewis Carroll.] 

“If he goes to a party, he does not fail to brush his hair. 

To look fascinating it is necessary to be tidy. 

If he is an opium eater, then he has no self-command. 

If he brushes his hair, he looks fascinating. 

He wears white kid gloves only if he goes to a party. 

Having no self-command is sufficient to make one look untidy. 
Therefore . . .” 

*11. THE INDIRECT METHOD OF PROOF 

A proof is an argument which shows that a conditional statement
of the form p → q is logically true. (Namely, p is the conjunction
of the premises, and q is the conclusion of the argument.) Sometimes
it is more convenient to show that an equivalent conditional statement
is logically true.

Example 1. Let x and y be positive integers.

Theorem. If xy is an odd number, then x and y are both odd. 

Proof. Suppose, on the contrary, that they are not both odd. Then
one of them is even, say x = 2z. Then xy = 2zy is an even number,
contrary to hypothesis. Hence we have proved our theorem.
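
A finite spot check is no substitute for the proof, but it shows concretely that the theorem and the statement actually proved (its contrapositive) agree. The Python sketch below checks both over the positive integers up to 20; the bound and the names are assumptions of this example.

```python
def is_odd(n):
    return n % 2 == 1

pairs = [(x, y) for x in range(1, 21) for y in range(1, 21)]

# Theorem: if xy is odd, then x and y are both odd.
theorem = all(is_odd(x) and is_odd(y) for x, y in pairs if is_odd(x * y))

# Contrapositive: if x and y are not both odd, then xy is even.
contrapositive = all(not is_odd(x * y) for x, y in pairs
                     if not (is_odd(x) and is_odd(y)))

print(theorem, contrapositive)  # True True
```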

Example 2. “He did not know the first name of the president of 
the Jones Corporation, hence he cannot be an employee of that firm. 
Why? Because every employee of that firm calls the boss by his first 
name (behind his back). Therefore, if he were really an employee of 
Jones, then he would know Jones’s first name.” 


These are simple examples of a very common form of argument, 
frequently used both in mathematics and in everyday discussions. 
Let us try to unravel the form of the argument. 


Given:       xy is an odd number.            He doesn’t know Jones’s        p
                                             first name.

To prove:    x and y are both odd numbers.   He doesn’t work for Jones.     q

Suppose:     x and y are not both odd        He does work for Jones.        ~q
             numbers.

Then:        xy is an even number.           He must know what Jones’s      ~p
                                             first name is.



In each case we assume the contradictory to the conclusion and derive, 
by a valid argument, a result contradictory to the hypothesis. This 
is one form of the indirect method of proof. 

To restate, what we want to do is to show that the conditional
p → q is logically true; what we actually show is that the contrapositive
~q → ~p is logically true. Since these two statements are
equivalent our procedure is valid.

There are several other important variants of this method of proof. 
It is easy to check that the following statements have the same truth 
table as (that is, are equivalent to) the conditional p → q:

(1) (p ∧ ~q) → ~p,

(2) (p ∧ ~q) → q,

(3) (p ∧ ~q) → (r ∧ ~r).

The first of these shows that in the indirect method of proof we may
make use of the original hypothesis in addition to the contradictory
assumption ~q. The second shows that we may also use this double
hypothesis in the direct proof of the conclusion q. The third shows
that if, from the double hypothesis p and ~q we can arrive at a
contradiction of the form r ∧ ~r, then the proof of the original
statement is complete. This last form of the method is often referred
to as reductio ad absurdum.

These last forms of the method are very useful for the following 
reasons: First of all we see that we can always take ~q as a hypothesis 
in addition to p. Secondly we see that besides q there are two other 
conclusions (~p or a contradiction) which are just as good. 
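
The claimed equivalences can be confirmed by comparing truth tables over p, q, and r. In the Python sketch below the conclusion of form (1) is taken to be ~p, as the preceding paragraph indicates; the helper names are assumptions of this example.

```python
from itertools import product

implies = lambda a, b: (not a) or b

def column(statement):
    """Truth table column of a compound in p, q, r over all eight cases."""
    return [statement(p, q, r) for p, q, r in product([True, False], repeat=3)]

direct = column(lambda p, q, r: implies(p, q))                      # p -> q
form_1 = column(lambda p, q, r: implies(p and not q, not p))        # (p ^ ~q) -> ~p
form_2 = column(lambda p, q, r: implies(p and not q, q))            # (p ^ ~q) -> q
form_3 = column(lambda p, q, r: implies(p and not q, r and not r))  # (p ^ ~q) -> (r ^ ~r)

print(direct == form_1 == form_2 == form_3)  # True
```

Each of the three forms is false exactly when p is true and q is false, which is the only case in which p → q is false.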

EXERCISES 

1. Construct indirect proofs for the following assertions: 

(a) If x² is odd, then x is odd (x an integer).

(b) If I am to pass this course, I must do homework regularly. 

(c) If he earns a great deal of money (more than $20,000), he is not a
college professor. 

2. Give a symbolic analysis of the following argument: 

“If he is to succeed, he must be both competent and lucky.
Because, if he is not competent, then it is impossible for him to
succeed. If he is not lucky, something is sure to go wrong.” 

3. Construct indirect proofs for the following assertions: 

(a) If p ∨ q and ~q, then p.

(b) If p ↔ q and q → ~r and r, then ~p.

4. Give a symbolic analysis of the following argument: 

“If Jones is the murderer, then he knows the exact time of death 
and the murder weapon. Therefore, if he does not know the exact 
time or does not know the weapon, then he is not the murderer.” 

5. Verify that forms (1), (2), and (3) given above are equivalent to p → q.

6. Give an example of an indirect proof of some statement in which from 
p and ~q a contradiction is derived. 

7. Give a statement equivalent to (p ∧ q) → r, which is in terms of
~p, q, and ~r. Show how this can be used in a proof where there are two
hypotheses given.

*12. APPLICATIONS TO SWITCHING CIRCUITS

The theory of compound statements has many applications to subjects
other than pure mathematics. As an example we shall develop
a theory of simple switching networks. 

A switching network is an arrangement of wires and switches which
connect together two terminals T₁ and T₂. Each switch can be either
“open” or “closed.” An open switch prevents the flow of current,
while a closed switch permits flow. The problem that we want to
solve is the following: given a network and given the knowledge of
which switches are closed, determine whether or not current will flow
from T₁ to T₂.


T₁ •──P──• T₂          T₁ •──P──Q──• T₂

Figure 34 Figure 35 

Figure 34 shows the simplest kind of a network in which the terminals
are connected by a single wire containing a switch P. If P is
closed, then current will flow between the terminals, and otherwise
it does not. The network in Figure 35 has two switches P and Q in
“series.” Here the current flows only if both P and Q are closed.

To see how our logical analysis can be used to solve the problem
stated above let us associate a statement with each switch. Let p be
the statement “Switch P is closed” and let q be the statement “Switch
Q is closed.” Then in Figure 34 current will flow if and only if p is
true. Similarly in Figure 35 the current will flow if and only if both
p and q are true, that is, if and only if p ∧ q is true. Thus the first
circuit is represented by p and the second by p ∧ q.



Figure 36 Figure 37 


In Figure 36 is shown a network with switches P and Q in “parallel.”
In this case the current flows if either of the switches is closed,
so the circuit is represented by the statement p ∨ q.

The network in Figure 37 combines the series and parallel types
of connections. The upper branch of the network is represented by
the statement p ∧ q and the lower by r ∧ s; hence the entire circuit
is represented by (p ∧ q) ∨ (r ∧ s). Since there are four switches
and each one can be either open or closed, there are 2⁴ = 16 possible
settings for these switches. Similarly, the statement (p ∧ q) ∨ (r ∧ s)
has four variables, so that its truth table has 16 rows in it. The switch
settings for which current flows correspond to the entries in the truth
table for which the above compound statement is true.
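
The correspondence between switch settings and truth-table rows can be tried out directly. The sketch below, in Python, treats a closed switch as True, a series connection as "and," and a parallel connection as "or"; the function name `current_flows` is an illustrative choice, not notation from the text.

```python
from itertools import product

def current_flows(p, q, r, s):
    # Upper branch: P and Q in series. Lower branch: R and S in series.
    # The two branches are connected in parallel.
    return (p and q) or (r and s)

# Every switch is either closed (True) or open (False).
settings = list(product([True, False], repeat=4))
conducting = [setting for setting in settings if current_flows(*setting)]

print(len(settings))    # number of possible settings
print(len(conducting))  # number of settings for which current flows
```

Listing `conducting` gives exactly the rows of the truth table of (p ∧ q) ∨ (r ∧ s) that are marked true.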

Switches need not always act independently of each other. It is 
possible to couple two or more switches together so that they open 
and close simultaneously, and we shall indicate this in diagrams by 
giving all such switches the same letter. It is also possible to couple 
two switches together, so that if one is closed, the other is open. We 
shall indicate this by giving the first switch the letter P and the second 
the letter P'. Then the statement “P is closed” is true if and only if 
the statement “P' is closed” is false. Therefore if p is the statement 
“P is closed,” then ~p is the statement “P' is closed.” 

Such a circuit is illustrated in Figure 38. The associated compound
statement is [p ∨ (~p ∧ ~q)] ∨ [p ∧ q]. Since this statement is false
only if p is false and q is true, the current will flow unless P is
open and Q is closed.


Figure 38 (three parallel branches between T₁ and T₂: P alone; P' and Q' in series; P and Q in series)


We can also check directly. If P is closed, current will flow through 
the top branch regardless of Q’s setting. If both switches are open, 
then P' and Q' will be closed, so that current will flow through the 
middle branch. But if P is open and Q is closed, none of the 
branches will pass current. 

Notice that we never had to consider current flow through the
bottom branch. The logical counterpart of this fact is that the
statement associated with the network is equivalent to [p ∨ (~p ∧ ~q)],
whose associated network is just the upper two branches of Figure
38. Thus the electrical properties of the circuit of Figure 38 would
be the same if the lower branch were omitted.
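
The claim that the lower branch is superfluous can be confirmed by comparing the two networks on all four switch settings. This is a small Python sketch; the function names are illustrative, and the coupled switches P' and Q' are modeled simply as `not p` and `not q`.

```python
from itertools import product

def full_network(p, q):
    # Three parallel branches: P | P' and Q' in series | P and Q in series.
    return (p or ((not p) and (not q))) or (p and q)

def upper_two_branches(p, q):
    # The same network with the bottom branch removed.
    return p or ((not p) and (not q))

rows = list(product([True, False], repeat=2))
same_behavior = all(full_network(p, q) == upper_two_branches(p, q) for p, q in rows)
blocked = [(p, q) for p, q in rows if not full_network(p, q)]

print(same_behavior)
print(blocked)  # settings (p, q) for which no current flows
```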

As a last problem we shall consider the design of a switching network
having certain specified properties. An equivalent problem,
which we solved in Section 4, is that of constructing a compound
statement having a given truth table. As in that section, we shall
limit ourselves to statements having three variables, although our
methods could easily be extended.

In Section 4 we developed a general method for finding a statement
having a given truth table not consisting entirely of F's. (The circuit
which corresponds to a statement whose truth table consists entirely
of F's is one in which current never flows, and hence is not of
interest.) Each such statement could be constructed as a disjunction
of basic conjunctions. Since the basic conjunctions were of the form
p ∧ q ∧ r, p ∧ q ∧ ~r, etc., each will be represented by a circuit
consisting of three switches in series and will be called a basic series
circuit. The disjunction of certain of these basic conjunctions will
then be represented by the circuit obtained by putting several basic
series circuits in parallel. The resulting network will not, in general,
be the simplest possible such network fulfilling the requirements, but
the method always suffices to find one.

Example. A three-man committee wishes to employ an electric 
circuit to record a secret simple majority vote. Design a circuit so 
that each member can push a button for his “yes” vote (not push it 
for a “no” vote), and so that a signal light will go on if a majority 
of the committee members vote yes. 


p   q   r   Truth value   Basic conjunction

T   T   T        T        p ∧ q ∧ r
T   T   F        T        p ∧ q ∧ ~r
T   F   T        T        p ∧ ~q ∧ r
T   F   F        F        p ∧ ~q ∧ ~r
F   T   T        T        ~p ∧ q ∧ r
F   T   F        F        ~p ∧ q ∧ ~r
F   F   T        F        ~p ∧ ~q ∧ r
F   F   F        F        ~p ∧ ~q ∧ ~r


Figure 39


Let p be the statement “committee member 1 votes yes,” let q be
the statement “member 2 votes yes,” and let r be “member 3 votes
yes.” The truth table of the statement “majority of the members
vote yes” appears in Figure 39. From that figure we can read off the
desired compound statement as

(p ∧ q ∧ r) ∨ (p ∧ q ∧ ~r) ∨ (p ∧ ~q ∧ r) ∨ (~p ∧ q ∧ r).

The circuit desired for the voting procedure appears in Figure 40.

Figure 40 (labels: voting buttons, voltage source)
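
The design method of the example (one basic series circuit for each true row of the truth table, all placed in parallel) can be sketched in Python as follows. The names `majority`, `basic_conjunction`, and `light_on` are illustrative assumptions; the sketch only restates the construction described above and makes no attempt to simplify the resulting network.

```python
from itertools import product

def majority(p, q, r):
    # The desired truth table: true when at least two members vote yes.
    return (p + q + r) >= 2

def basic_conjunction(row, names=("p", "q", "r")):
    # One basic conjunction per row: a variable appears negated when
    # its entry in the row is False.
    return " ∧ ".join(n if value else "~" + n for n, value in zip(names, row))

# Keep only the rows of the truth table in which the statement is true.
true_rows = [row for row in product([True, False], repeat=3) if majority(*row)]
statement = " ∨ ".join("(" + basic_conjunction(row) + ")" for row in true_rows)

def light_on(p, q, r):
    # Basic series circuits (all switches must match) placed in parallel.
    return any(all(v == want for v, want in zip((p, q, r), row)) for row in true_rows)

print(statement)
```

For the majority vote this produces four basic series circuits of three switches each, which is the network the example arrives at.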

EXERCISES 

1. What kind of a circuit has a logically true statement assigned to it? 
Give an example. 

2. Construct a network corresponding to 

[(p ∧ ~q) ∨ (~p ∧ q)] ∨ (~p ∧ ~q).

3. What compound statement represents: 

[circuit diagram]


4. Work out the truth table of the statement in Exercise 3. What does 
this tell us about the circuit? 

5. Design a simpler circuit than the one in Exercise 3, having the same 
properties. 

6. Construct a network corresponding to

[(p ∨ q) ∧ ~r] ∨ [(~p ∧ r) ∨ q].

7. Design a circuit for an electrical version of the game of matching
pennies: At a given signal each of the two players either opens or closes a
switch under his control. If they both do the same, A wins; if they do the
opposite, then B wins. Design the circuit so that a light goes on if A wins.

8. In a large hall it is desired to turn the lights on or off from any one of
four switches on the four walls. This can be accomplished by designing a
circuit which turns the light on if an even number of switches are closed, and
off if an odd number are closed. (Why does this solve the problem?) Design
such a circuit.


9. A committee has five members. It takes a majority vote to carry a 
measure, except that the chairman has a veto (i.e., the measure carries only 
if he votes for it). Design a circuit for the committee, so that each member 
votes for a measure by pressing a button, and the light goes on if and only 
if the measure is carried.

10. A group of candidates is asked to take a true-false exam, with four 
questions. Design a circuit such that a candidate can push the buttons of 
those questions to which he wants to answer “true,” and that the circuit 
will indicate the number of correct answers. [Hint: Have five lights,
corresponding to 0, 1, 2, 3, 4 correct answers, respectively.]

11. Devise a scheme for working truth tables by means of switching


Chapter II


SETS AND SUBSETS 


1. INTRODUCTION 

A well-defined collection of objects is known as a set. This concept,
in its complete generality, is of great importance in mathematics since 
all of mathematics can be developed by starting from it. 

The various pieces of furniture in a given room form a set. So do 
the books in a given library, or the integers between 1 and 1,000,000, 
or all the ideas that mankind has had, or the human beings alive 
between one billion b.c. and ten billion a.d. These examples are all 
examples of finite sets, that is, sets having a finite number of elements. 
All the sets discussed in this book will be finite sets. 

There are two essentially different ways of specifying a set. One 
can give a rule by which it can be determined whether or not a given 
object is a member of the set, or one can give a complete list of the 
elements in the set. We shall say that the former is a description of 
the set and the latter is a listing of the set. For example, we can define 
a set of four people as (a) the members of the quartet which played 
in town last night, or (b) the people whose names are Jones, Smith, 
Brown, and Green. It is customary to use braces to surround the listing
of a set; thus the set above should be listed {Jones, Smith, Brown,
Green}. 

We shall frequently be interested in sets of logical possibilities, since
the analysis of such sets is very often a major task in the solving of a
problem. Suppose, for example, that we were interested in the successes
of three candidates who enter the presidential primaries (we
assume there are no other entries). Suppose that the key primaries
will be held in New Hampshire, Minnesota, Wisconsin, and California.
Assume that candidate A enters all the primaries, that B does not
contest in New Hampshire's primary, and C does not contest in
Wisconsin's. A list of the logical possibilities is given in Figure 1.
Since the New Hampshire and Wisconsin primaries can each end in
two ways, and the Minnesota and California primaries can each end
in three ways, there are in all 2 · 2 · 3 · 3 = 36 different logical
possibilities as listed in Figure 1.

A set that consists of some members of another set is called a subset 
of that set. For example, the set of those logical possibilities in Figure 
1 for which the statement “Candidate A wins at least three primaries”
is true, is a subset of the set of all logical possibilities. This subset can
also be defined by listing its members: {P1, P2, P3, P4, P7, P13, P19}.
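
The list of logical possibilities and the subset just described can be generated rather than written out by hand. The following Python sketch assumes the numbering P1, ..., P36 of Figure 1, in which the California result changes fastest; the dictionary `entrants` and the other names are illustrative.

```python
from itertools import product

# Candidates contesting each primary, in the column order of Figure 1.
entrants = {
    "New Hampshire": "AC",  # B does not contest
    "Minnesota": "ABC",
    "Wisconsin": "AB",      # C does not contest
    "California": "ABC",
}

# product() varies the last primary fastest, matching the numbering P1..P36.
possibilities = {
    f"P{i}": outcome
    for i, outcome in enumerate(product(*entrants.values()), start=1)
}

# Description of a subset: "Candidate A wins at least three primaries."
a_wins_three = [
    name for name, outcome in possibilities.items() if outcome.count("A") >= 3
]

print(len(possibilities))
print(a_wins_three)
```

The last two lines illustrate the two ways of specifying a set: the rule inside the list comprehension is a description, and the printed list is a listing.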

In order to discuss all the subsets of a given set, let us introduce
the following terminology. We shall call the original set the universal
set, one-element subsets will be called unit sets, and the set which
contains no members the empty set. We do not introduce special
names for other kinds of subsets of the universal set. As an example,
let the universal set 𝒰 consist of the three elements {a, b, c}. The
proper subsets of 𝒰 are those sets containing some but not all of the
elements of 𝒰. The proper subsets consist of three two-element sets,
namely, {a, b}, {a, c}, and {b, c}, and three unit sets, namely, {a},
{b}, and {c}. To complete the picture we also consider the universal
set a subset (but not a proper subset) of itself, and we consider the
empty set ℰ, that contains no elements of 𝒰, as a subset of 𝒰. At first
it may seem strange that we should include the sets 𝒰 and ℰ as subsets
of 𝒰, but the reasons for their inclusion will become clear later.

We saw that the three-element set above had 8 = 2³ subsets. In
general, a set with n elements has 2ⁿ subsets, as can be seen in the
following manner. We form subsets P of 𝒰 by considering each of
the elements of 𝒰 in turn and deciding whether or not to include it
in the subset P. If we decide to put every element of 𝒰 into P we get
the universal set, and if we decide to put no element of 𝒰 into P we
get the empty set. In most cases we will put some but not all the
elements into P and thus obtain a proper subset of 𝒰. We have to
make n decisions, one for each element of the set, and for each decision
we have to choose between two alternatives. We can make these
decisions in 2 · 2 · ... · 2 = 2ⁿ ways, and hence this is the number of

Possibility   Winner in       Winner in   Winner in   Winner in
Number        New Hampshire   Minnesota   Wisconsin   California

P1            A               A           A           A
P2            A               A           A           B
P3            A               A           A           C
P4            A               A           B           A
P5            A               A           B           B
P6            A               A           B           C
P7            A               B           A           A
P8            A               B           A           B
P9            A               B           A           C
P10           A               B           B           A
P11           A               B           B           B
P12           A               B           B           C
P13           A               C           A           A
P14           A               C           A           B
P15           A               C           A           C
P16           A               C           B           A
P17           A               C           B           B
P18           A               C           B           C
P19           C               A           A           A
P20           C               A           A           B
P21           C               A           A           C
P22           C               A           B           A
P23           C               A           B           B
P24           C               A           B           C
P25           C               B           A           A
P26           C               B           A           B
P27           C               B           A           C
P28           C               B           B           A
P29           C               B           B           B
P30           C               B           B           C
P31           C               C           A           A
P32           C               C           A           B
P33           C               C           A           C
P34           C               C           B           A
P35           C               C           B           B
P36           C               C           B           C


Figure 1

different subsets of 𝒰 that can be formed. Observe that our formula
would not have been so simple if we had not included the universal
set and the empty set as subsets of 𝒰.

In the example of the voting primaries above there are 2³⁶ or about
70 billion subsets. Of course, we cannot deal with this many subsets
in a practical problem, but fortunately we are usually interested in
only a few of the subsets. The most interesting subsets are those
which can be defined by means of a simple rule such as “The set of
all logical possibilities in which C loses at least two primaries.” It
would be difficult to give a simple description for the subset containing
the elements {P1, P4, P14, P30, P34}. On the other hand, we shall
see in the next section how to define new subsets in terms of subsets
already defined.


Examples. We illustrate the two different ways of specifying sets
in terms of the primary voting example. Let the universal set 𝒰 be
the logical possibilities given in Figure 1.

1. What is the subset of 𝒰 in which candidate B wins more primaries
than either of the other candidates? Answer: {P11, P12, P17,
P23, P26, P28, P29}.

2. What is the subset in which the primaries are split two and two?
Answer: {P5, P8, P10, P15, P21, P30, P31, P35}.

3. Describe the set {P1, P4, P19, P22}. Answer: The set of possibilities
for which A wins in Minnesota and California.

4. How can we describe the set {P18, P24, P27}? Answer: The set
of possibilities for which C wins in California, and the other primaries
are split three ways.


EXERCISES 

1. In the primary example, give a listing for each of the following sets. 

(a) The set in which C wins at least two primaries. 

(b) The set in which the first three primaries are won by the same 
candidate. 

(c) The set in which B wins all four primaries. 

2. The primaries are considered decisive if a candidate can win three
primaries, or if he wins two primaries including California. List the set in
which the primaries are decisive.

3. Give simple descriptions for the following sets (referring to the primary
example).

(a) {P33, P36}.

(b) {P10, P11, P12, P28, P29, P30}.

(c) {P6, P20, P22}.

4. Joe, Jim, Pete, Mary, and Peg are to be photographed. They want to 
line up so that boys and girls alternate. List the set of all possibilities. 

5. In Exercise 4, list the following subsets. 

(a) The set in which Pete and Mary are next to each other. 

(b) The set in which Peg is between Joe and Jim. 

(c) The set in which Jim is in the middle. 

(d) The set in which Mary is in the middle. 

(e) The set in which a boy is at each end. 

6. Pick out all pairs in Exercise 5 in which one set is a subset of the other. 

7. A TV producer is planning a half-hour show. He wants to have a 
combination of comedy, music, and commercials. If each is allotted a multiple 
of five minutes, construct the set of possible distributions of time. (Consider 
only the total time allotted to each.) 

8. In Exercise 7, list the following subsets. 

(a) The set in which more time is devoted to comedy than to music. 

(b) The set in which no more time is devoted to commercials than to 
either music or comedy. 

(c) The set in which exactly five minutes is devoted to music. 

(d) The set in which all three of the above conditions are satisfied. 

9. In Exercise 8, find two sets, each of which is a proper subset of the set 
in (a) and also of the set in (c). 

2. OPERATIONS ON SUBSETS 

In Chapter I we considered the ways in which one could form new
statements from given statements. Now we shall consider an analogous
procedure, the formation of new sets from given sets. We shall
assume that each of the sets that we use in the combination is a subset 
of some universal set, and we shall also want the newly formed set 
to be a subset of the same universal set. As usual we can specify a 
newly formed set either by a description or by a listing. 

If P and Q are two sets we shall define a new set P ∩ Q, called the
intersection of P and Q, as follows: P ∩ Q is the set which contains
those and only those elements which belong to both P and Q. As an
example, consider the logical possibilities listed in Figure 1. Let P be
the subset in which candidate A wins at least three primaries, i.e., the
set {P1, P2, P3, P4, P7, P13, P19}; let Q be the subset in which A
wins the first two primaries, i.e., the set {P1, P2, P3, P4, P5, P6}.
Then the intersection P ∩ Q is the set in which both events take
place, i.e., where A wins the first two primaries and wins at least three
primaries. Thus P ∩ Q is the set {P1, P2, P3, P4}.

If P and Q are two sets we shall define a new set P ∪ Q, called the
union of P and Q, as follows: P ∪ Q is the set that contains those and
only those elements that belong either to P or to Q (or to both). In the
example in the paragraph above, the union P ∪ Q is the set of possibilities
for which either A wins the first two primaries or wins at least
three primaries, i.e., the set {P1, P2, P3, P4, P5, P6, P7, P13, P19}.
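
Both operations are available on Python's built-in `set` type, so the example can be replayed in a few lines. In this sketch each possibility is represented by its number only (1 for P1, and so on), which is a simplification made here for brevity.

```python
# Possibilities are represented by their numbers in Figure 1.
P = {1, 2, 3, 4, 7, 13, 19}  # A wins at least three primaries
Q = {1, 2, 3, 4, 5, 6}       # A wins the first two primaries

intersection = P & Q  # elements belonging to both P and Q
union = P | Q         # elements belonging to P or to Q (or to both)

print(sorted(intersection))
print(sorted(union))
```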



Figure 2 


To help in visualizing these operations we shall draw diagrams, 
called Venn diagrams, which illustrate them. We let the universal set 
be a rectangle and let subsets be circles drawn inside the rectangle. 
In Figure 2 we show two sets P and Q as shaded circles. Then the 
doubly crosshatched area is the intersection P ∩ Q and the total
shaded area is the union P ∪ Q.

If P is a given subset of the universal set 𝒰, we can define a new
set P̃ called the complement of P as follows: P̃ is the set of all elements
of 𝒰 that are not contained in P. For example, if, as above, Q is the
set in which candidate A wins the first two primaries, then Q̃ is the
set {P7, P8, ..., P36}. The shaded area in Figure 3 is the complement
of the set P. Observe that the complement of the empty set ℰ
is the universal set 𝒰, and also that the complement of the universal
set is the empty set.

Sometimes we shall be interested in only part of the complement
of a set. For example, we might wish to consider the part of the
complement of the set Q that is contained in P, i.e., the set P ∩ Q̃.
The shaded area in Figure 4 is P ∩ Q̃.



A somewhat more suggestive definition of this set can be given as
follows: let P − Q be the difference of P and Q, that is, the set that
contains those elements of P that do not belong to Q. Figure 4 shows
that P ∩ Q̃ and P − Q are the same set. In the primary voting example
above the set P − Q can be listed as {P7, P13, P19}.

The complement of a subset is a special case of a difference set,
since we can write Q̃ = 𝒰 − Q. If P and Q are nonempty subsets
whose intersection is the empty set, i.e., P ∩ Q = ℰ, then we say that
they are disjoint subsets.
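
Complements and differences can be checked in the same way. The sketch below again represents each possibility by its number, takes the universal set to be the numbers 1 through 36, and writes the complement as a difference from the universal set; the variable names are illustrative.

```python
U = set(range(1, 37))        # the 36 logical possibilities of Figure 1
P = {1, 2, 3, 4, 7, 13, 19}  # A wins at least three primaries
Q = {1, 2, 3, 4, 5, 6}       # A wins the first two primaries

Q_complement = U - Q         # the complement as a special case of a difference
difference = P - Q           # elements of P that do not belong to Q

# The difference is the part of the complement of Q that lies in P.
same_set = difference == (P & Q_complement)

# The difference and Q have no element in common.
are_disjoint = difference.isdisjoint(Q)

print(sorted(difference), same_set, are_disjoint)
```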


Examples. In the primary voting example let R be the set in
which A wins the first three primaries, i.e., the set {P1, P2, P3}; let
S be the set in which A wins the last two primaries, i.e., the set
{P1, P7, P13, P19, P25, P31}. Then R ∩ S = {P1} is the set in which
A wins the first three primaries and also the last two, that is, he wins
all the primaries. We also have

R ∪ S = {P1, P2, P3, P7, P13, P19, P25, P31},

which can be described as the set in which A wins the first three
primaries or the last two. The set in which A does not win the first
three primaries is R̃ = {P4, P5, ..., P36}. Finally, we see that the
difference set R − S is the set in which A wins the first three
primaries but not both of the last two. This set can be found by taking
from R the element P1 which it has in common with S, so that
R − S = {P2, P3}.


EXERCISES 

1. Draw Venn diagrams for P Q, P Pi Q, P Q, P C\ Q. 

2. Give a step-by-step construction of the diagram for (P − Q) ∪ (P ∩ Q).

3. Venn diagrams are also useful when three subsets are given. Construct
such a diagram, given the subsets P, Q, and R. Identify each of the eight
resulting areas in terms of P, Q, and R.

4. In testing blood, three types of antigens are looked for: A, B, and Rh. 
Every person is classified doubly. He is Rh positive if he has the Rh antigen, 
and Rh negative otherwise. He is type AB, A, or B depending on which of 
the other antigens he has, with type O having neither A nor B. Draw a Venn 
diagram, and identify each of the eight areas. 

5. Considering only two subsets, the set X of people having antigen A, 
and the set Y of people having antigen B, define (symbolically) the types 
AB, A, B, and O.

6. A person can receive blood from another person if he has all the 
antigens of the donor. Describe in terms of X and Y the sets of people who 
can give to each of the four types. Identify these sets in terms of blood types. 

7.

            Liked       Liked      Disliked   Disliked
            very much   slightly   slightly   very much

Men             1           3          5          10
Women           6           8          3           1
Boys            5           5          3           2
Girls           8           5          1           1


This tabulation records the reaction of a number of spectators to a
television show. All the categories can be defined in terms of the following four:
M (male), G (grown-up), L (liked), Vm (very much). How many people fall
into each of the following categories:


(a) M. [Ans. 34.]

(b) L.

(c) Vm.

(d) M ∩ G̃ ∩ L̃ ∩ Vm. [Ans. 2.]

(e) M ∩ G ∩ L.

(f) (M ∩ G) ∪ (L ∩ Vm).

(g) The complement of (M ∩ G). [Ans. 48.]

(h) (M ∪ G).

(i) (M − G).

(j) [M − (G ∩ L ∩ Vm)].


8. In a survey of 100 students, the numbers studying various languages 
were found to be: Spanish, 28; German, 30; French, 42; Spanish and German, 
8; Spanish and French, 10; German and French, 5; all three languages, 3. 

(a) How many students were studying no language? [Ans. 20.] 

(b) How many students had French as their only language? [Ans. 30.]

(c) How many students studied German if and only if they studied
French? [Ans. 38.]

[Hint: Draw a Venn diagram with three circles, for French, German, and 
Spanish students. Fill in the numbers in each of the eight areas, using the 
data given above. Start from the end of the list and work back.] 

9. In a later survey of the 100 students (see Exercise 8), numbers studying
the various languages were found to be: German only, 18; German but
not Spanish, 23; German and French, 8; German, 26; French, 48; French 
and Spanish, 8; no language, 24. 

(a) How many students took Spanish? [Ans. 18.] 

(b) How many took German and Spanish but not French? 

[Ans. None.] 

(c) How many took French if and only if they did not take Spanish? 

[Ans. 50.] 

10. The report of one survey of the 100 students (see Exercise 8) stated 
that the numbers studying the various languages were: all three languages, 5; 
German and Spanish, 10; French and Spanish, 8; German and French, 20; 
Spanish, 30; German, 23; French, 50. The surveyor who turned in this report 
was fired. Why? 

11. A recent survey of 100 Dartmouth students has revealed the information
about their dates that is summarized in the following table.

            Beautiful     Plain         Beautiful   Plain
            and           and           and         and
            Intelligent   Intelligent   Dumb        Dumb

Blonde          6             9            10         20
Brunette        7            11            15          9
Redhead         2             3             8          0


Let BL = blondes, BR = brunettes, R = redheads, BE = beautiful girls,
D = dumb girls. Determine the number of girls in each of the following
classes.

(a) BL ∩ BE ∩ D. [Ans. 10.]

(b) BR.

(c) R ∩ D̃.

(d) (BR ∪ R) ∩ (BE ∪ D̃). [Ans. 46.]

(e) BL ∪ (BE ∩ D).

12. In Exercise 11, which set of each of the following pairs has more girls
as members?

(a) (BL ∪ BR) or R.

(b) D ∩ BE or BL − (D ∩ BE).

(c) ℰ or R ∩ BE ∩ D.

3. THE RELATIONSHIP BETWEEN SETS AND COMPOUND STATEMENTS 

The reader may have observed several times in the preceding sections
that there was a close connection between sets and statements,
and between set operations and compounding operations. In this section
we shall formalize these relationships.

If we have a number of statements under consideration there is a 
natural way of assigning a set to each one of these statements. First 
we form the set of all logical possibilities for the statements under 
consideration and call this set the universal set. Then to each statement
we assign the subset of logical possibilities of the universal set
for which that statement is true. This idea is so important that we
embody it in a formal definition.

Definition. Let p, q, r, ... be statements and let 𝒰 be their set
of logical possibilities; let P, Q, R, ... be the subsets of 𝒰 for which
statements p, q, r, ... are respectively true; then we call P, Q,
R, ... the truth sets of statements p, q, r, ....

If p and q are statements, then p ∨ q and p ∧ q are also statements
and hence must have truth sets. To find the truth set of p ∨ q we
observe that it is true whenever p is true or q is true (or both). Therefore
we must assign to p ∨ q the logical possibilities which are in P
or in Q (or both); that is, we must assign to p ∨ q the set P ∪ Q.
On the other hand, the statement p ∧ q is true only when both p and
q are true, so that we must assign to p ∧ q the set P ∩ Q.

Thus we see that there is a close connection between the logical 
operation of disjunction and the set operation of union, and also 
between conjunction and intersection. A careful examination of the 
definitions of union and intersection shows that the word “or” occurs
in the definition of union and the word “and” occurs in the definition 
of intersection. Thus the connection between the two theories is not 
surprising. 

Since the connective “not” occurs in the definition of the complement
of a set, it is not surprising that the truth set of ~p is P̃. This
follows since ~p is true when p is false, so that the truth set of ~p
contains all logical possibilities for which p is false, that is, the
truth set of ~p is P̃.
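
The three correspondences can be tested on the smallest interesting case, two statements p and q with four logical possibilities. In this Python sketch the helper `truth_set` and the representation of a logical possibility as a pair of truth values are assumptions made for illustration.

```python
from itertools import product

# Universal set: the four logical possibilities for the pair (p, q).
U = set(product([True, False], repeat=2))

def truth_set(statement):
    """Return the subset of U on which the statement is true."""
    return {case for case in U if statement(*case)}

P = truth_set(lambda p, q: p)
Q = truth_set(lambda p, q: q)

disjunction_is_union = truth_set(lambda p, q: p or q) == P | Q
conjunction_is_intersection = truth_set(lambda p, q: p and q) == P & Q
negation_is_complement = truth_set(lambda p, q: not p) == U - P

print(disjunction_is_union, conjunction_is_intersection, negation_is_complement)
```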

The truth sets of two propositions p and q are shown in Figure 5.
Also marked on the diagram are the various logical possibilities for
these two statements. The reader should pick out in this diagram the
truth sets of the statements p ∨ q, p ∧ q, ~p, and ~q.


Figure 5


The connection between a statement and its truth set makes it
possible to “translate” a problem about compound statements into
a problem about sets. It is also possible to go in the reverse direction.
Given a problem about sets, think of the universal set as being a set
of logical possibilities and think of a subset as being the truth set
of a statement. Hence we can “translate” a problem about sets into
a problem about compound statements.

So far we have discussed only the truth sets assigned to compound
statements involving ∨, ∧, and ~. All the other connectives can
be defined in terms of these three basic ones, so that we can deduce
what truth sets should be assigned to them. For example, we know
that p → q is equivalent to ~p ∨ q (see Figure 28 of Chapter I). Hence
the truth set of p → q is the same as the truth set of ~p ∨ q, that is,
it is P̃ ∪ Q. The Venn diagram for p → q is shown in Figure 6, where
the shaded area is the truth set for the statement. Observe that
the unshaded area in Figure 6 is the set P − Q = P ∩ Q̃, which
is the truth set of the statement p ∧ ~q. Thus the shaded area is
the complement of P − Q, that is, the set P̃ ∪ Q, which is the truth
set of the statement ~[p ∧ ~q]. We have thus discovered the fact that
(p → q), (~p ∨ q), and ~(p ∧ ~q) are equivalent. It is always the case
that two compound statements are equivalent if and only if they have
the same truth sets. We also see that Venn diagrams can be used to
discover relations between statements.

Suppose now that p is a statement that is logically true. What is
its truth set? Now p is logically true if and only if it is true in every
logically possible case, so that the truth set of p must be 𝒰. Similarly,
if p is logically false, then it is false for every logically possible case,
so that its truth set is the empty set ℰ.

Finally, let us consider the implication relation. Recall that p implies
q if and only if the conditional p → q is logically true. But p → q
is logically true if and only if its truth set is 𝒰, that is, the complement
of P − Q is 𝒰, or P − Q = ℰ. From Figure 4 we see that if P − Q is empty,
then P is contained in Q. We shall symbolize the containing relation as
follows: P ⊂ Q means “P is a subset of Q.” We conclude that p → q
is logically true if and only if P ⊂ Q.
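
The truth set of the conditional and the subset test for implication can both be tried on two statements. The Python sketch below is illustrative only: the helpers `truth_set` and `implies` are names introduced here, and the pair of statements p ∧ q and p is a sample chosen for the demonstration, not one taken from the text.

```python
from itertools import product

U = set(product([True, False], repeat=2))  # logical possibilities for (p, q)

def truth_set(statement):
    return {case for case in U if statement(*case)}

def implies(first, second):
    # "first implies second" holds exactly when the truth set of the
    # first statement is contained in the truth set of the second.
    return truth_set(first) <= truth_set(second)

P = truth_set(lambda p, q: p)
Q = truth_set(lambda p, q: q)
conditional = truth_set(lambda p, q: (not p) or q)

# Two descriptions of the truth set of the conditional.
same_as_union = conditional == (U - P) | Q
same_as_complement = conditional == U - (P - Q)

# Sample implication checks (statements chosen for illustration).
conjunction_implies_p = implies(lambda p, q: p and q, lambda p, q: p)
p_implies_q = implies(lambda p, q: p, lambda p, q: q)

print(same_as_union, same_as_complement, conjunction_implies_p, p_implies_q)
```

Here p does not imply q because the possibility in which p is true and q is false lies in P − Q, so that difference is not empty.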

Let us briefly summarize the above discussion. To each statement
there corresponds a truth set. To each logical connective there
corresponds a set operation. To each relation between statements there
corresponds a relation between the truth sets. The truth sets of the
statements p ∨ q, p ∧ q, ~p, and p → q are P ∪ Q, P ∩ Q, P̃, and
the complement of P − Q, respectively. Statement p is logically true
if P = 𝒰 and logically false if P = ℰ. Statements p and q are
equivalent if and only if P = Q, and p implies q if and only if P ⊂ Q.


Example 1. Prove by means of a Venn diagram that the statement
[p ∨ (~p ∨ q)] is logically true. The assigned set of this statement
is [P ∪ (P̃ ∪ Q)], and its Venn diagram is shown in Figure 7. In
that figure the set P is shaded vertically, and the set P̃ ∪ Q is shaded
horizontally. Their union is the entire shaded area, which is 𝒰, so that
the compound statement is logically true.


Figure 7


Example 2. Prove by means of Venn diagrams that $p \vee (q \wedge r)$
is equivalent to $(p \vee q) \wedge (p \vee r)$. The truth set of $p \vee (q \wedge r)$ is the
entire shaded area of Figure 8, and the truth set of $(p \vee q) \wedge (p \vee r)$
is the doubly shaded area in Figure 9. Since these two sets are equal,
we see that the two statements are equivalent.


Example 3. Show by means of a Venn diagram that q implies
$p \to q$. The truth set of $p \to q$ is the shaded area in Figure 6. Since
this shaded area includes the set Q, we see that q implies $p \to q$.

EXERCISES

Note: In Exercises 1, 2, and 3, find first the truth set of each statement. 

1. Use Venn diagrams to test which of the following statements are 
logically true or logically false. 

(a) $p \vee \sim p$.

(b) $p \wedge \sim p$.

(c) $p \vee (\sim p \wedge q)$.

(d) $p \to (q \to p)$.

(e) $p \wedge \sim(q \to p)$.

[Ans. (a), (d) logically true; (b), (e) logically false.]

2. Use Venn diagrams to test the following statements for equivalences. 

(a) $p \vee \sim q$.

(b) $\sim(p \wedge q)$.

(c) $\sim(q \wedge \sim p)$.

(d) $p \to \sim q$.

(e) $\sim p \vee \sim q$.

[Ans. (a) and (c) equivalent; (b) and (d) and (e) equivalent.]

3. Use Venn diagrams for the following pairs of statements to test whether 
one implies the other. 

(a) $p$; $p \wedge q$.

(b) $p \wedge \sim q$; $\sim p \to \sim q$.

(c) $p \to q$; $q \to p$.

(d) $p \wedge q$; $p \wedge \sim q$.

4. A pair of statements is said to be inconsistent if they cannot both be
true. Devise a test for inconsistency.

5. Three or more statements are said to be inconsistent if they cannot
all be true. What does this state about their truth sets?

6. In the following three compound statements (a) assign variables to 
the components, (b) bring the statements into symbolic form, (c) find the 
truth sets, and (d) test for consistency. 

If this is a good course, then I will work hard in it. 

If this is not a good course, then I shall get a bad grade in it. 

I will not work hard, but I will get a good grade in this course. 

[Ans. Inconsistent.]

Note: In Exercises 7-9 assign to each set a statement having it as a truth 
set. 

7. Use truth tables to find which of the following sets are empty. 

(a ) (PVjQ)r\(p\jQ). 

(b) (p n Q) r\ (Q n R). 

( C ) (pr\Q) -p. 

(d) (PU R) r\ (P U Q). [Ans. (b) and (c).] 

8. Use truth tables to find out whether the following sets are all different. 

(a) P n (Q KJ R). 

(b) (R - Q) VJ (Q - R). 

(c) (BW Q)n (Rr\Q). 

(d) (pn®u(pri r). 

(e) (pngnfi)u (pr\Qr\R)\j (pr\Qr\R)\j 

(pnQnR). 

9. Use truth tables for the following pairs of sets to test whether one is 
a subset of the other. 

(a) P; PC\Q. 

(b) Pr\Q;QC\P. 

(c) P-Q;Q-P. 

(d) Pr\Q;PVQ. 

10. Show, both by the use of truth tables and by the use of Venn diagrams,
that $p \wedge (q \vee r)$ is equivalent to $(p \wedge q) \vee (p \wedge r)$.

*4. THE ABSTRACT LAWS OF SET OPERATIONS 

The set operations which we have introduced obey some very simple 
abstract laws, which we shall list in this section. These laws can be 
proved by means of Venn diagrams or they can be translated into 
statements and checked by means of truth tables. 



The abstract laws given below bear a close resemblance to the
elementary algebraic laws with which the student is already familiar.
The resemblance can be made even more striking by replacing $\cup$ by
$+$ and $\cap$ by $\times$. For this reason, a set, its subsets, and the laws of
combination of subsets are considered an algebraic system, called a
Boolean algebra, after the British mathematician George Boole, who
was the first person to study them from the algebraic point of view.
Any other system obeying these laws, for example the system of compound
statements studied in Chapter I, is also known as a Boolean
algebra. We can study any of these systems from either the algebraic
or the logical point of view.

Below are the basic laws of Boolean algebras. The proofs of these 
laws will be left as exercises. 


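The laws governing union and intersection:

A1. $A \cup A = A$.
A2. $A \cap A = A$.
A3. $A \cup B = B \cup A$.
A4. $A \cap B = B \cap A$.
A5. $A \cup (B \cup C) = (A \cup B) \cup C$.
A6. $A \cap (B \cap C) = (A \cap B) \cap C$.
A7. $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.
A8. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
A9. $A \cup \mathcal{U} = \mathcal{U}$.
A10. $A \cap \mathcal{E} = \mathcal{E}$.
A11. $A \cap \mathcal{U} = A$.
A12. $A \cup \mathcal{E} = A$.

The laws governing complements:

B1. $\tilde{\tilde{A}} = A$.
B2. $A \cup \tilde{A} = \mathcal{U}$.
B3. $A \cap \tilde{A} = \mathcal{E}$.
B4. $\widetilde{A \cup B} = \tilde{A} \cap \tilde{B}$.
B5. $\widetilde{A \cap B} = \tilde{A} \cup \tilde{B}$.
B6. $\tilde{\mathcal{U}} = \mathcal{E}$.

The laws governing set-differences:

C1. $A - B = A \cap \tilde{B}$.
C2. $\mathcal{U} - A = \tilde{A}$.
C3. $A - \mathcal{U} = \mathcal{E}$.
C4. $A - \mathcal{E} = A$.
C5. $\mathcal{E} - A = \mathcal{E}$.
C6. $A - A = \mathcal{E}$.
C7. $(A - B) - C = A - (B \cup C)$.
C8. $A - (B - C) = (A - B) \cup (A \cap C)$.
C9. $A \cup (B - C) = (A \cup B) - (C - A)$.
C10. $A \cap (B - C) = (A \cap B) - (A \cap C)$.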
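
Besides Venn diagrams and truth tables, the laws can be tested by brute force on a small universal set. The sketch below is only an illustration, not a substitute for the proofs asked for in the exercises: the three-element universe, the helper names `subsets` and `comp`, and the selection of laws are our own assumptions.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})   # a small universal set chosen for the test
E = frozenset()            # the empty set

def subsets(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def comp(a):
    """Complement relative to U."""
    return U - a

# A selection of the laws, each written as a test on sets a, b, c.
laws = {
    "A7": lambda a, b, c: a & (b | c) == (a & b) | (a & c),
    "A8": lambda a, b, c: a | (b & c) == (a | b) & (a | c),
    "A9": lambda a, b, c: a | U == U,
    "B1": lambda a, b, c: comp(comp(a)) == a,
    "B4": lambda a, b, c: comp(a | b) == comp(a) & comp(b),
    "B5": lambda a, b, c: comp(a & b) == comp(a) | comp(b),
    "C1": lambda a, b, c: a - b == a & comp(b),
    "C7": lambda a, b, c: (a - b) - c == a - (b | c),
    "C8": lambda a, b, c: a - (b - c) == (a - b) | (a & c),
    "C9": lambda a, b, c: a | (b - c) == (a | b) - (c - a),
    "C10": lambda a, b, c: a & (b - c) == (a & b) - (a & c),
}

all_subsets = subsets(U)
results = {
    name: all(law(a, b, c) for a, b, c in product(all_subsets, repeat=3))
    for name, law in laws.items()
}
print(results)
```

Each entry of `results` records whether the corresponding law held for every choice of A, B, and C among the eight subsets of the test universe.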


EXERCISES 

1. Test laws in the group A1-A12 by means of Venn diagrams. 

2. “Translate” the A-laws into laws about compound statements. Test 
these by truth tables. 

3. Test the laws in groups B and C by Venn diagrams. 

4. “Translate” the B- and C-laws into laws about compound statements. 
Test these by means of truth tables. 

5. Derive the following results from the 28 basic laws. 

(a) $A = (A \cap B) \cup (A \cap \tilde{B})$.

(b) $A \cup B = (A \cap B) \cup (A \cap \tilde{B}) \cup (\tilde{A} \cap B)$.

(c) $A \cap (A \cup B) = A$.

(d) $A \cup (\tilde{A} \cap B) = A \cup B$.

6. From the A- and B-laws and from Cl, derive C2-C6. 

7. Use A1-A12 and C2-C10 to derive Bl, B2, B3, and B6. 

5. TWO-DIGIT NUMBER SYSTEMS 

In the decimal number system one can write any number by using 
only the ten digits, 0, 1, 2, . . . , 9. Other number systems can be 
constructed which use either fewer or more digits. Probably the 
simplest number system is the binary number system which uses only 
the digits 0 and 1. We shall consider all the possible ways of forming 
number systems using only these two digits. 

The two basic arithmetical operations are addition and multiplication.
To understand any arithmetic system, it is necessary to know
how to add or multiply any two digits together. Thus to understand
the decimal system, we had to learn a multiplication table and an
addition table each of which had 100 entries. To understand the
binary system we have to learn a multiplication and an addition
table, each of which has only four entries. These are shown
in Figure 10.

      +  | 0  1          ·  | 0  1
     ----+------        ----+------
      0  | 0  1          0  | 0  0
      1  | 1  ?          1  | 0  1

                 Figure 10

The multiplication table given there is completely determined
by the two familiar rules that multiplying a number by zero
gives zero, and multiplying a number by one leaves it unchanged.
For addition we have only the rule that the addition of zero to a
number does not change that number. The latter rule is sufficient to
determine all but one of the entries in the addition table in
Figure 10. We must still decide what shall be the sum 1 + 1.

What are the possible ways in which we can complete the addition
table? The only one-digit numbers that we can use are 0 and 1, and
these lead to interesting systems. Of the possible two-digit numbers
we see that 00, 01 are the same as 0 and 1 and so do not give anything
new. The number 11 or any greater number would introduce a
“jump” in the table, hence the only other possibility is 10. The
addition tables of these three different number systems are shown in
Figure 11, and they all have the multiplication table shown in
Figure 10. Each of these systems is interesting in itself, as the
interpretations below show.

      + | 0  1        + | 0  1        + | 0  1
     ---+------      ---+------      ---+-------
      0 | 0  1        0 | 0  1        0 | 0  1
      1 | 1  0        1 | 1  1        1 | 1  10

        (a)             (b)             (c)

                      Figure 11

Let us say that the parity of a positive integer is the fact of its being
odd or even. Consider now the number system having the addition
table in Figure 11(a) and let 0 represent “even” and 1 represent
“odd.” The tables above now tell how the parity of a combination
of two positive integers is related to the parity of each. Thus 0 · 1 = 0
tells us that the product of an even number and an odd number is
even, while 1 + 1 = 0 tells us that the sum of two odd numbers is
even, etc. Thus the first number system is that which we get from the
arithmetic of the positive integers if we consider only the parity of
numbers.
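
The parity interpretation can be tested directly. In the sketch below (an illustration under our own naming; `ADD`, `MUL`, and `parity` are not from the text) the two tables are entered as dictionaries and compared with ordinary arithmetic on a range of positive integers.

```python
# Tables for the parity system: addition as in Figure 11(a),
# multiplication as in Figure 10.
ADD = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
MUL = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def parity(n):
    """0 for an even integer, 1 for an odd integer."""
    return n % 2

# The parity of a sum or product should depend only on the parities of the
# two numbers, exactly as the tables say. We test integers 1 through 20.
consistent = all(
    parity(a + b) == ADD[parity(a), parity(b)]
    and parity(a * b) == MUL[parity(a), parity(b)]
    for a in range(1, 21)
    for b in range(1, 21)
)
print(consistent)
```

The check covers only finitely many integers, of course; the general statement follows from writing each integer as an even number plus its parity.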

The second number system, which has the addition table in Figure
11(b), has an interpretation in terms of sets. Let 0 correspond to the
empty set $\mathcal{E}$ and 1 correspond to the universal set $\mathcal{U}$. Let the addition
of numbers correspond to the union of sets and let the multiplication
of numbers correspond to the intersection of sets. Then 0 · 1 = 0 tells us that
$\mathcal{E} \cap \mathcal{U} = \mathcal{E}$ and 1 + 1 = 1 tells us that $\mathcal{U} \cup \mathcal{U} = \mathcal{U}$. The student
should give the interpretations for the other arithmetic computations
possible for this number system.
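
As a check on this interpretation, one may build the two tables from the set operations themselves. The sketch below assumes a particular two-element universal set and uses our own names (`as_set`, `as_digit`, `add_table`, `mul_table`); any non-empty universal set would serve equally well.

```python
U = frozenset({"a", "b"})   # an arbitrary non-empty universal set
E = frozenset()             # the empty set

as_set = {0: E, 1: U}       # digit -> set
as_digit = {E: 0, U: 1}     # set -> digit

# "Addition" is union and "multiplication" is intersection.
add_table = {(x, y): as_digit[as_set[x] | as_set[y]]
             for x in (0, 1) for y in (0, 1)}
mul_table = {(x, y): as_digit[as_set[x] & as_set[y]]
             for x in (0, 1) for y in (0, 1)}

print(add_table)
print(mul_table)
```

The resulting addition table can be compared entry by entry with Figure 11(b), and the multiplication table with Figure 10.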

Finally, the third number system, which has the addition table
in Figure 11(c), is the so-called binary number system. Every ordinary
integer can be written as a binary integer. Thus the binary 0 corresponds
to the ordinary 0, and the binary unit 1 to the ordinary
single unit. The binary number 10 means a “unit of higher order”
and corresponds to the ordinary number two (not to ten). The binary
number 100 then means two times two or four. In general, if
$b_n b_{n-1} \ldots b_2 b_1 b_0$ is a binary number, where each digit is either 0 or 1,
then the corresponding ordinary integer I is given by the formula

$I = b_n \cdot 2^n + b_{n-1} \cdot 2^{n-1} + \ldots + b_2 \cdot 2^2 + b_1 \cdot 2 + b_0.$

Thus the binary number 11001 corresponds to $2^4 + 2^3 + 1 =
16 + 8 + 1 = 25$. The table in Figure 12 shows some binary numbers
and their integer equivalents.


  Binary number | 1  10  11  100  101  110  111  1000  10000  100000
  Integer       | 1   2   3    4    5    6    7     8     16      32

                            Figure 12
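
The formula for I translates directly into a short program. The following sketch is ours, not the text's: the function names `binary_to_int` and `int_to_binary` are hypothetical, and the sum is evaluated digit by digit from the left (doubling the running value each time) rather than by computing each power of 2 separately, which gives the same result.

```python
def binary_to_int(digits):
    """Evaluate b_n*2^n + ... + b_1*2 + b_0 for a string of binary digits."""
    value = 0
    for d in digits:
        if d not in ("0", "1"):
            raise ValueError("not a binary digit: " + d)
        value = value * 2 + int(d)   # shift previous digits up one place
    return value

def int_to_binary(n):
    """Write a non-negative integer as a string of binary digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))    # remainder is the lowest digit
        n //= 2
    return "".join(reversed(digits))

print(binary_to_int("11001"))
print(int_to_binary(25))
```

Applied to the entries of Figure 12, `binary_to_int` should reproduce the second row from the first.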


Because electronic circuits are particularly well adapted to performing
computations in the binary system, modern high-speed electronic
computers are frequently constructed to work in the binary system.

Example. As an example of a computation, let us multiply 5 by 5
in the binary system. Since the binary equivalent of 5 is the number
101, the multiplication is done as follows.

        101
      × 101
      -----
        101
       000
      101
      -----
      11001

The answer is the binary number 11001, which we saw above was 
equivalent to the integer 25, the answer we expected to get. 
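
The same computation can be carried out by a program that works on the binary digits themselves. The sketch below is one possible arrangement, with hypothetical function names `binary_add` and `binary_multiply`; it adds column by column with a carry (using 1 + 1 = 10) and multiplies by adding shifted copies of the first factor, as in the layout above.

```python
def binary_add(a, b):
    """Add two binary numerals given as strings, column by column with carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, out = 0, []
    for x, y in zip(reversed(a), reversed(b)):
        s = int(x) + int(y) + carry
        out.append(str(s % 2))   # digit written in this column
        carry = s // 2           # digit carried to the next column
    if carry:
        out.append("1")
    return "".join(reversed(out)).lstrip("0") or "0"

def binary_multiply(a, b):
    """Multiply by adding a shifted copy of a for each digit 1 in b."""
    result = "0"
    for shift, digit in enumerate(reversed(b)):
        if digit == "1":
            result = binary_add(result, a + "0" * shift)
    return result

print(binary_multiply("101", "101"))
```

The rows of zeros in the hand computation correspond to the digits 0 of the second factor, which the program simply skips.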

EXERCISES 

1. Complete the interpretations of the addition and multiplication tables,
for the number systems representing (a) parity, (b) the sets $\mathcal{U}$ and $\mathcal{E}$.

2. (a) What are the binary numbers corresponding to the integers 11, 52,
64, 98, 128, 144? [Ans. 1100010 corresponds to 98.]

(b) What integers correspond to the binary numbers 1111, 1010101,
1000000, 11011011? [Ans. 1010101 corresponds to 85.]

3. Carry out the following operations in the binary system. Check your 
answer. 

(a) 29 + 20. 

(b) 9 · 7.

4. Of the laws listed below, which apply to each of the three systems? 

(a) x + y = y + x. 

(b) x + x = x. 

(c) x + x + x = x.

5. Interpret a + b to be the larger of the two numbers a and b, and a · b
to be the smaller of the two. Write tables of “addition” and “multiplication”
for the digits 0 and 1. Compare the result with the three systems given above.

[Ans. Same as the $\mathcal{U}$, $\mathcal{E}$ system.]

6. What do the laws A1-A10 of the last section tell us about the second 
number system established above? 

7. The first number system above (about parity) can be interpreted to
deal with the remainders of integers when divided by 2. An even number
leaves 0, an odd number leaves 1. Construct tables of addition and
multiplication for remainders of integers when divided by 3. [Hint: These
will be 3 × 3 tables.]

8. Given a set of four elements, suppose that we want to number its 
subsets. For a given subset write down a binary number as follows: The 
first digit is 1 if and only if the first element is in the subset, the second digit 
is 1 if and only if the second element is in the subset, etc. Prove that this 
assigns a unique number, from 0 to 15, to each subset. 

9. In a multiple choice test the answers were numbered 1, 2, 4, and 8. 
The students were told that there might be no correct answer, or that one or 
more answers might be correct. They were told to add together the numbers 
of the correct answers (or to write 0 if no answer was correct). 

(a) By using the result of Exercise 8, show that the resulting number 
gives the instructor all the information he wants. 

(b) On a given question the correct sum was 7. Three students put
down 4, 8, and 15, respectively. Which answer was most nearly
correct? Which answer was worst? [Ans. 15 best, 8 worst.]

10. In the ternary number system numbers are expressed to the base 3, so
that 201 in this system stands for $2 \cdot 3^2 + 0 \cdot 3 + 1 \cdot 1 = 19$.

(a) Write the numbers from 1 through 30 in this notation.

(b) Construct a table of addition and multiplication for the digits 0,
1, 2.

(c) Carry out the multiplication of 5 · 5 in this system. Check your
answer.

11. Explain the meaning of the numeral “2907” in our ordinary (base 10)
notation, in analogy to the formula I given for the binary system.

*6. VOTING COALITIONS 

As an application of our set concepts we shall consider the significance
of voting coalitions in voting bodies. Here the universal set is
a set of human beings which form a decision-making body. For example,
the universal set might be the members of a committee, or of
a city council, or of a convention, or of the House of Representatives,
etc. Each member can cast a certain number of votes. The decision
as to whether or not a measure is passed can be decided by a simple
majority rule, or 2/3 majority, etc.

Suppose now that a subset of the members of the body forms a
coalition in order to pass a measure. The question is whether or not
they have enough votes to guarantee passage of the measure. If they
have enough votes to carry the measure, then we say they form a
winning coalition. If the members not in the coalition can pass a measure
of their own, then we say that the original coalition is a losing
coalition. Finally, if the members of the coalition cannot carry their
measure, and if the members not in the coalition cannot carry their
measure, then the coalition is called a blocking coalition.

Let us restate these definitions in set-theoretic terms. A coalition
C is winning if they have enough votes to carry an issue; coalition C
is losing if the coalition $\tilde{C}$ is winning; and coalition C is blocking if
neither C nor $\tilde{C}$ is a winning coalition.

The following facts are immediate consequences of these definitions.
The complement of a winning coalition is a losing coalition.
The complement of a losing coalition is a winning coalition. The
complement of a blocking coalition is a blocking coalition.

Example 1. A committee consists of six men each having one vote. 
A simple majority vote will carry an issue. Then any coalition of four 
or more members is winning, any coalition with one or two members 
is losing, and any three-person coalition is blocking. 

Example 2. Suppose in Example 1 one of the six members (say 
the chairman) is given the power to break ties. Then any three-person 
coalition of which he is a member is winning, while the other three- 
person coalitions are losing; hence there are no blocking coalitions. 
The other coalitions are as in Example 1. 

Example 3. Let the universal set $\mathcal{U}$ be the set {x, y, w, z}, where
x and y each has one vote, w has two votes, and z has three votes.
Suppose it takes five votes to carry a measure. Then the winning
coalitions are: {z, w}, {z, x, y}, {z, w, x}, {z, w, y}, and $\mathcal{U}$. The losing
coalitions are the complements of these sets. Blocking coalitions are
{z}, {z, x}, {z, y}, {w, x}, {w, y}, and {w, x, y}.
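
The lists in Example 3 can be reproduced by examining every subset of the universal set. The sketch below is an illustration under our own assumptions: the function name `classify` is hypothetical, and we assume that the number of votes required exceeds half the total, so that a coalition and its complement cannot both be winning.

```python
from itertools import combinations

def classify(votes, quota):
    """Sort every coalition into winning, losing, or blocking.

    votes maps each member to a number of votes; quota is the number of
    votes needed to carry a measure (assumed to exceed half the total).
    """
    members = sorted(votes)
    everyone = frozenset(members)

    def wins(coalition):
        return sum(votes[m] for m in coalition) >= quota

    result = {"winning": [], "losing": [], "blocking": []}
    for r in range(len(members) + 1):
        for combo in combinations(members, r):
            c = frozenset(combo)
            if wins(c):
                result["winning"].append(c)
            elif wins(everyone - c):       # the complement is winning
                result["losing"].append(c)
            else:                          # neither C nor its complement wins
                result["blocking"].append(c)
    return result

# The data of Example 3: x and y have one vote, w has two, z has three.
result = classify({"x": 1, "y": 1, "w": 2, "z": 3}, quota=5)
for kind, coalitions in result.items():
    print(kind, [sorted(c) for c in coalitions])
```

The empty coalition appears among the losing coalitions, since it is the complement of the universal set.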

The last example shows that it is not always necessary to list all
members of a winning coalition. For example, if the coalition {z, w}
is winning, then it is obvious that the coalition {z, w, y} is also winning.
In general, if a coalition C is winning, then any other set that
has C as a subset will also be winning. Thus we are led to the notion
of a minimal winning coalition. A minimal winning coalition is a
winning coalition which contains no smaller winning coalition as a
subset. Another way of stating this is that a minimal winning coalition
is a winning coalition such that, if any member is lost from the
coalition, then it ceases to be a winning coalition.

If we know the minimal winning coalitions, then we know everything
that we need to know about the voting problem. The winning
coalitions are all those sets that contain a minimal winning coalition,
and the losing coalitions are the complements of the winning coalitions.
All other sets are blocking coalitions.

In Example 1 the minimal winning coalitions are the sets containing
four members. In Example 2 the minimal winning coalitions are the
three-member coalitions that contain the tie-breaking member and
the four-member coalitions that do not contain the tie-breaking member.
The minimal winning coalitions in the third example are the sets
{z, w} and {z, x, y}.
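
The second description of a minimal winning coalition (a winning coalition that ceases to win if any one member is lost) is easy to program. The following sketch uses a hypothetical function `minimal_winning` and assumes a voting body in which each member simply has a fixed number of votes, as in Examples 1 and 3.

```python
from itertools import combinations

def minimal_winning(votes, quota):
    """Return the winning coalitions that lose if any single member leaves."""
    members = sorted(votes)

    def wins(coalition):
        return sum(votes[m] for m in coalition) >= quota

    minimal = []
    for r in range(len(members) + 1):
        for combo in combinations(members, r):
            c = frozenset(combo)
            if wins(c) and not any(wins(c - {m}) for m in c):
                minimal.append(c)
    return minimal

# Example 3: the text gives {z, w} and {z, x, y}.
example3 = minimal_winning({"x": 1, "y": 1, "w": 2, "z": 3}, quota=5)
print([sorted(c) for c in example3])

# Example 1: six members with one vote each; four votes carry an issue.
example1 = minimal_winning({m: 1 for m in "abcdef"}, quota=4)
print(len(example1))
```

Testing the loss of a single member suffices here because removing further members can only reduce the number of votes.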

Sometimes there are committee members who have special powers
or lack of power. If a member can pass any measure he wishes without
needing anyone else to vote with him, then we call him a dictator.
Thus member x is a dictator if and only if {x} is a winning coalition.
A somewhat weaker but still very powerful member is one who can
by himself block any measure. If x is such a member, then we say
that x has veto power. Thus x has veto power if and only if {x} is a
blocking coalition. Finally if x is not a member of any minimal winning
coalition, we shall call him a powerless member. Thus x is powerless
if and only if any winning coalition of which x is a member is a
winning coalition without him.
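
These three definitions can be applied mechanically. In the sketch below the function `special_members` and the two small committees are hypothetical illustrations of our own, not examples from the text; each member again has a fixed number of votes.

```python
from itertools import combinations

def special_members(votes, quota):
    """Return (dictators, members with veto power, powerless members)."""
    members = sorted(votes)
    everyone = frozenset(members)

    def wins(coalition):
        return sum(votes[m] for m in coalition) >= quota

    coalitions = [frozenset(c) for r in range(len(members) + 1)
                  for c in combinations(members, r)]
    minimal = [c for c in coalitions
               if wins(c) and not any(wins(c - {m}) for m in c)]

    # {x} is a winning coalition.
    dictators = [m for m in members if wins({m})]
    # {x} is a blocking coalition: neither {x} nor its complement wins.
    veto = [m for m in members
            if not wins({m}) and not wins(everyone - {m})]
    # x belongs to no minimal winning coalition.
    powerless = [m for m in members if not any(m in c for c in minimal)]
    return dictators, veto, powerless

# Hypothetical committee 1: a has 3 of the 5 votes; 3 votes carry an issue.
print(special_members({"a": 3, "b": 1, "c": 1}, quota=3))

# Hypothetical committee 2: a has 2 of the 4 votes; 3 votes carry an issue.
print(special_members({"a": 2, "b": 1, "c": 1}, quota=3))
```

In the first committee the member a is a dictator and the others are powerless; in the second, a cannot pass a measure alone but nothing passes without him.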

Example 4. An interesting example of a decision-making body is
the Security Council of the United Nations. The Security Council
has eleven members consisting of the five permanent large-nation
members called the Big Five, and six small-nation members. In order
that a measure be passed by the Council, seven members including
all of the Big Five must vote for the measure. Thus the seven-member
sets made up of the Big Five plus two small nations are the minimal
winning coalitions. Then the losing coalitions are the sets that contain
no Big Five member and at most four small nations. The blocking
coalitions are the sets that are neither winning nor losing. In particular,
a unit set that contains one of the Big Five as a member is a blocking
coalition. This is the sense in which a Big Five member has a veto. [The
possibility of “abstaining” is immaterial in the above discussion.]
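
The rule of Example 4 is not stated in terms of numbers of votes, but coalitions can still be classified by testing the rule directly. The sketch below is an illustration only; the labels B1, ..., B5 and S1, ..., S6 for the members are our own.

```python
from itertools import combinations

BIG_FIVE = frozenset({"B1", "B2", "B3", "B4", "B5"})       # hypothetical labels
SMALL = frozenset({"S1", "S2", "S3", "S4", "S5", "S6"})    # hypothetical labels
COUNCIL = BIG_FIVE | SMALL

def wins(coalition):
    """Seven or more members, including all of the Big Five."""
    return BIG_FIVE <= coalition and len(coalition) >= 7

def all_coalitions(members):
    items = sorted(members)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

counts = {"winning": 0, "losing": 0, "blocking": 0}
for c in all_coalitions(COUNCIL):
    if wins(c):
        counts["winning"] += 1
    elif wins(COUNCIL - c):
        counts["losing"] += 1
    else:
        counts["blocking"] += 1
print(counts)

# Each Big Five member alone forms a blocking coalition.
single_veto = all(
    not wins(frozenset({b})) and not wins(COUNCIL - {b}) for b in BIG_FIVE
)
print(single_veto)
```

The three counts together account for all $2^{11} = 2048$ subsets of the Council.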

EXERCISES 

1. A committee has w, x, y, and z as members. Member w has two votes,
the others have one vote each. List the winning, losing, and blocking
coalitions.

2. A committee has n members, each with one vote. It takes a majority 
vote to carry an issue. What are the winning, losing, and blocking coalitions? 

3. The Board of Estimate of New York City consists of eight members
with voting strength as follows:

    s. Mayor ............................ 3 votes
    t. Controller ....................... 3
    u. Council President ................ 3
    v. Brooklyn Borough President ....... 2
    w. Manhattan Borough President ...... 2
    x. Bronx Borough President .......... 1
    y. Richmond Borough President ....... 1
    z. Queens Borough President ......... 1


A simple majority is needed to carry an issue. List the minimal winning 
coalitions. List the blocking coalitions. Do the same if we give the mayor 
the additional power to break ties. 

4. A company has issued 100,000 shares of common stock and each share
has one vote. How many shares must a stockholder have to be a dictator?
How many to have a veto? [Ans. 50,001; 50,000.]

5. In Exercise 4, if the company requires a 2/3 majority vote to carry an
issue, how many shares must a stockholder have to be a dictator or to have
a veto? [Ans. At least 66,667; at least 33,334.]

6. Prove that if a committee has a dictator as a member, then the
remaining members are powerless.

7. We can define a maximal losing coalition in analogy to the minimal 
winning coalitions. What is the relation between the maximal losing and 
minimal winning coalitions? Do the maximal losing coalitions provide all 
relevant information? 

8. Prove that any two minimal winning coalitions have at least one
member in common.

9. Find all the blocking coalitions in the Security Council example.

10. Prove that if a man has veto power and if he together with any one
other member can carry a measure, then the distribution of the remaining
votes is irrelevant.


PARTITIONS AND COUNTING


1. PARTITIONS 

The problem to be studied in this chapter can be most conveniently
described in terms of partitions of a set. A partition of a set $\mathcal{U}$ is a
subdivision of the set into subsets that are disjoint and exhaustive, i.e.,
every element of $\mathcal{U}$ must belong to one and only one of the subsets.
The subsets $A_i$ in the partition are called cells. Thus $[A_1, A_2, \ldots, A_r]$
is a partition of $\mathcal{U}$ if two conditions are satisfied: (1) $A_i \cap A_j = \mathcal{E}$ if
$i \neq j$ (the cells are disjoint) and (2) $A_1 \cup A_2 \cup \ldots \cup A_r = \mathcal{U}$ (the
cells are exhaustive).

Example 1. If $\mathcal{U} = \{a, b, c, d, e\}$, then $[\{a, b\}, \{c, d, e\}]$ and
$[\{b, c, e\}, \{a\}, \{d\}]$ and $[\{a\}, \{b\}, \{c\}, \{d\}, \{e\}]$ are three different
partitions of $\mathcal{U}$. The last is a partition into unit sets.
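
Conditions (1) and (2) can be tested directly for any proposed list of cells. The following sketch is an illustration; the function name `is_partition` is our own.

```python
def is_partition(cells, universe):
    """Check that the cells are pairwise disjoint and together exhaust the universe."""
    cells = [set(c) for c in cells]
    # Condition (1): distinct cells have empty intersection.
    disjoint = all(not (a & b)
                   for i, a in enumerate(cells)
                   for b in cells[i + 1:])
    # Condition (2): the union of all cells is the universal set.
    exhaustive = set().union(*cells) == set(universe)
    return disjoint and exhaustive

U = {"a", "b", "c", "d", "e"}
print(is_partition([{"a", "b"}, {"c", "d", "e"}], U))
print(is_partition([{"b", "c", "e"}, {"a"}, {"d"}], U))
print(is_partition([{"a"}, {"b"}, {"c"}, {"d"}, {"e"}], U))
print(is_partition([{"a", "b"}, {"b", "c", "d", "e"}], U))   # cells overlap in b
```

The first three calls test the partitions of Example 1; the last list fails condition (1).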

The process of going from a fine to a less fine analysis of a set of 
logical possibilities is actually carried out by means of a partition. 
For example, let us consider the logical possibilities for the first three 
games of the World Series if the Yankees play the Dodgers. We can 
list the possibilities in terms of the winner of each game as 

{YYY, YYD, YDY, DYY, YDD, DDY, DYD, DDD}. 

We form a partition by putting all the possibilities with the same
number of wins for the Yankees in a single cell,

[{YYY}, {YYD, YDY, DYY}, {YDD, DDY, DYD}, {DDD}].

Thus, if we wish the possibilities to be Yankees win three games,
win two, win one, win zero, then we are considering a less detailed
analysis obtained from the former analysis by identifying the
possibilities in each cell of the partition.

If $[A_1, A_2, \ldots, A_r]$ and $[B_1, B_2, \ldots, B_s]$ are two partitions of
the same set $\mathcal{U}$, we can obtain a new partition by considering the
collection of all subsets of $\mathcal{U}$ of the form $A_i \cap B_j$ (see Exercise 7).
This new partition is called the cross-partition of the original two
partitions.
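
Forming a cross-partition amounts to taking every intersection of a cell of the first partition with a cell of the second. The sketch below is ours: the function name `cross_partition` and the optional `keep_empty` argument are assumptions, added because some of the intersections may turn out to be empty. The data are the pair of partitions of Exercise 1(a) at the end of this section.

```python
def cross_partition(first, second, keep_empty=True):
    """Return all sets A_i & B_j, in order; optionally drop the empty ones."""
    cells = [a & b for a in first for b in second]
    return cells if keep_empty else [c for c in cells if c]

first = [frozenset({1, 2, 3}), frozenset({4, 5, 6})]
second = [frozenset({1, 4}), frozenset({2, 3, 5, 6})]

cells = cross_partition(first, second)
print([sorted(c) for c in cells])
```

With r cells in the first partition and s in the second, the list has r times s entries when empty intersections are kept.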

Example 2. A common use of cross-partitions is in the problem
of classification. For example, from the set $\mathcal{U}$ of all life forms we can
form the partition [P, A] where P is the set of all plants and A is the
set of all animals. We may also form the partition [E, F] where E is
the set of extinct life forms and F is the set of all existing life forms.
The cross-partition

$[P \cap E,\ P \cap F,\ A \cap E,\ A \cap F]$

gives a complete classification according to the two separate
classifications.

Many of the examples with which we shall deal in the future will
relate to processes which take place in stages. It will be convenient
to use partitions and cross-partitions to represent the stages of the
process. The graphical representation of such a process is, of course,
a tree. For example, suppose that the process is such that we learn
in succession the truth values of a series of statements relative to a
given situation. If $\mathcal{U}$ is the set of logical possibilities for the situation,
and p is a statement relative to $\mathcal{U}$, then the knowledge of the truth
value of p amounts to knowing which cell of the partition $[P, \tilde{P}]$
contains the actual possibility. Recall that P is the truth set of p,
and $\tilde{P}$ is the truth set of $\sim p$. Suppose now we discover the truth
value of a second statement q. This information can again be
described by a partition, namely, $[Q, \tilde{Q}]$. The two statements together
give us information which can be represented by the cross-partition
of these two partitions,

$[P \cap Q,\ P \cap \tilde{Q},\ \tilde{P} \cap Q,\ \tilde{P} \cap \tilde{Q}]$.

That is, if we know the truth values of p and q, we also know which
of the cells of this cross-partition contains the particular logical
possibility describing the given situation. Conversely, if we knew which
cell contained the possibility, we would know the truth values for the
statements p and q.

The information obtained by the additional knowledge of the truth
value of a third statement r, having a truth set R, can be represented
by the cross-partition of the three partitions $[P, \tilde{P}]$, $[Q, \tilde{Q}]$, $[R, \tilde{R}]$.
This cross-partition is

$[P \cap Q \cap R,\ P \cap Q \cap \tilde{R},\ P \cap \tilde{Q} \cap R,\ P \cap \tilde{Q} \cap \tilde{R},$
$\ \tilde{P} \cap Q \cap R,\ \tilde{P} \cap Q \cap \tilde{R},\ \tilde{P} \cap \tilde{Q} \cap R,\ \tilde{P} \cap \tilde{Q} \cap \tilde{R}]$.

Notice that now we have the possibility narrowed down to being in
one of $8 = 2^3$ possible cells. Similarly, if we knew the truth values
of n statements, our partition would have $2^n$ cells.
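
The count of $2^n$ cells can be seen by labeling each cell with the n truth values that single it out. The sketch below is a hypothetical illustration: the universal set (the integers 1 through 12) and the three statements about an integer are our own choices, and cells that happen to be empty are kept so that all $2^n$ labels appear.

```python
from itertools import product

def cross_partition_of_statements(universe, statements):
    """Map each tuple of truth values to the cell of possibilities having them."""
    cells = {key: set()
             for key in product([True, False], repeat=len(statements))}
    for case in universe:
        key = tuple(bool(s(case)) for s in statements)
        cells[key].add(case)
    return cells

universe = set(range(1, 13))
statements = [
    lambda n: n % 2 == 0,   # p: "n is even"
    lambda n: n > 6,        # q: "n is greater than 6"
    lambda n: n % 3 == 0,   # r: "n is divisible by 3"
]

cells = cross_partition_of_statements(universe, statements)
for key, cell in cells.items():
    print(key, sorted(cell))
```

Knowing the three truth values picks out one key, and hence one cell; conversely, knowing the cell gives the three truth values.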

If the set $\mathcal{U}$ were to contain $2^{20}$ (approximately one million) logical
possibilities, and if we were able to ask yes-no questions in such a way
that the knowledge of the truth value of each question would cut the
number of possibilities in half each time, then we could determine in
20 questions any given possibility in the set $\mathcal{U}$. We could accomplish
this kind of questioning, for example, if we had a list of all the
possibilities and were allowed to ask “Is it in the first half?,” and, if the
answer is yes, then “Is it in the first one-fourth?” etc. In practice
we ordinarily do not have such a list, and we can only approximate
this procedure.
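
The halving procedure is easily simulated when such a list is available. In the sketch below (our own illustration, with a hypothetical function `identify`) the possibilities are numbered 0, 1, 2, ..., and each question asks whether the unknown possibility lies in the first half of those still remaining.

```python
def identify(target, size):
    """Find target among 0, 1, ..., size - 1 by halving; count the questions."""
    low, high = 0, size          # remaining candidates are low, ..., high - 1
    questions = 0
    while high - low > 1:
        mid = (low + high) // 2
        questions += 1           # "Is it in the first half?"
        if target < mid:         # answer yes: keep the first half
            high = mid
        else:                    # answer no: keep the second half
            low = mid
    return low, questions

found, asked = identify(123456, 2 ** 20)
print(found, asked)
```

When the number of possibilities is exactly $2^{20}$, every possibility is located after the same number of questions, since each answer leaves exactly half of the candidates.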


Example 3. In the familiar radio game of twenty questions it is
not unusual for a contestant to try to carry out a partitioning of the
above kind. For example, he may know that he is trying to guess a
city. He might ask, “Is the city in North America?” and if the answer
is yes, “Is it in the United States?” and if yes, “Is it west of the
Mississippi?” and if no, “Is it in the New England states?” etc. Of
course, the above procedure does not actually divide the possibilities
exactly in half each time. The more nearly the answer to each question
comes to dividing the possibilities in half, the more certain one
can be of getting the answer in twenty questions, if there are at most
a million possibilities.

EXERCISES

1. If $\mathcal{U}$ is the set of integers from 1 to 6, find the cross-partitions of the
following pairs of partitions.

(a) [{1, 2, 3}, {4, 5, 6}] and [{1, 4}, {2, 3, 5, 6}].

(b) [{1, 2, 3, 4, 5}, {6}] and [{1, 3, 5}, {2, 6}, {4}].

[Ans. (a) [{1}, {2, 3}, {4}, {5, 6}].]

2. A coin is thrown three times. List the possibilities according to which 
side turns up each time. Give the partition formed by putting in the same 
cell all those possibilities for which the same number of heads occur. 

3. Let p and q be two statements with truth sets P and Q. What can be
said about the cross-partition of $[P, \tilde{P}]$ and $[Q, \tilde{Q}]$ in the case that:

(a) p implies q. [Ans. $P \cap \tilde{Q} = \mathcal{E}$.]

(b) p is equivalent to q. 

(c) p and q are inconsistent. 

4. Consider the set of eight states consisting of Illinois, Colorado,
Michigan, New York, Vermont, Texas, Alabama, and California.

(a) Show that in three “yes” or “no” questions one can identify any
one of the eight states.

(b) Design a set of three “yes” or “no” questions which can be
answered independently of each other and which will serve to identify
any one of the states.

5. An unabridged dictionary contains about 600,000 words and 3000 
pages. If a person chooses a word from such a dictionary, is it possible to 
identify this word by twenty “yes” or “no” questions? If so, describe the 
procedure that you would use and discuss the feasibility of the procedure. 

[Ans. One solution is the following. Use 12 questions to locate the
page, but then you may need 9 questions to locate the word.]

6. Mr. Jones has two parents, each of his parents had two parents, each
of these had two parents, etc. Tracing a person’s family tree back 40
generations (about 1000 years) gives Mr. Jones $2^{40}$ ancestors, which is more
people than have been on the earth in the last 1000 years. What is wrong
with this argument?

7. Let $[A_1, A_2, A_3]$ and $[B_1, B_2]$ be two partitions. Prove that the
cross-partition of the two given partitions really is a partition, that is, it
satisfies requirements (1) and (2) for partitions.

8. The cross-partition formed from the truth sets of n statements has
$2^n$ cells. As seen in Chapter I, the truth table of a statement compounded
from n statements has $2^n$ rows. What is the relationship between these two
facts?

9. Let p and q be statements with truth sets P and Q. Form the partition
$[P \cap Q,\ P \cap \tilde{Q},\ \tilde{P} \cap Q,\ \tilde{P} \cap \tilde{Q}]$. State in each case below which of the cells
must be empty in order to make the given statement a logically true
statement.

(a) $p \to q$.

(b) p q. 

(c) p V 

(d) p. 

10. A partition $[A_1, A_2, \ldots, A_n]$ is said to be a refinement of the partition
$[B_1, B_2, \ldots, B_m]$ if every $A_j$ is a subset of some $B_k$. Show that a
cross-partition of two partitions is a refinement of each of the partitions from which
the cross-partition is formed.

11. Consider the partition of the people in the United States determined 
by classification according to states. The classification according to county 
determines a second partition. Show that this is a refinement of the first 
partition. Give a third partition which is different from each of these and is a 
refinement of both. 

12. What can be said concerning the cross-partition of two partitions, one 
of which is a refinement of the other? 

13. Given nine objects, of which it is known that eight have the same 
weight and one is heavier, show how, in two weighings with a pan balance, 
the heavy one can be identified. 

14. Suppose that you are given thirteen objects, twelve of which are the 
same, but one is either heavier or lighter than the others. Show that, with 
three weighings using a pan balance, it is possible to identify the odd object. 
[A complete solution to this problem is given on page 42 of Mathematical 
Snapshots, second edition, by H. Steinhaus.] 

15. A subject can be completely classified by introducing several simple
subdivisions and taking their cross-partition. Thus, courses in college may be
partitioned according to subject, level of advancement, number of students,
hours per week, interests, etc. For each of the following subjects, introduce
five or more partitions. How many cells are there in the complete
classification (cross-partition) in each case?

(a) Detective stories. (b) Diseases. 

*2. APPLICATIONS 

Here we shall give three applications showing how partitions can
be used to describe three different situations in mathematical terms.
Examples like these will be more fully developed in later chapters.

Example 1. A simple game. Smith and Jones play the following
game: Jones is to hold concealed in his hand either a $1 or a $2 bill.
Smith is to guess which it is and gets the bill if he guesses correctly.
We shall consider in a later chapter the amount Smith should pay
to play the game in order to make it fair, but at the moment we are
interested only in describing the possibilities for the play of the
game. The game develops in two stages. First Jones chooses a $1
or a $2 bill, and secondly Smith guesses either 1 or 2. We can
represent the ways that these stages can be carried out by a tree with
four branches shown in Figure 1. The four ways that the game can
be played are represented by the four paths of the tree which we
denote by $a_1$, $a_2$, $a_3$, $a_4$.

We can also represent the progress of the game by a sequence of 
three partitions, 

Start: \([\{a_1, a_2, a_3, a_4\}]\)

Jones' choice: \([\{a_1, a_2\}, \{a_3, a_4\}]\)

Smith's choice: \([\{a_1\}, \{a_2\}, \{a_3\}, \{a_4\}]\).

Notice that we have associated a partition with each level of the tree.
Each cell of the partition associated with a particular level consists of
all the paths that pass through one branching point at that level.

We can also use partitions to indicate the amount of control which
each player has on the outcome. The control of Jones can be indicated
by the partition \([\{a_1, a_2\}, \{a_3, a_4\}]\). That is, Jones can
determine which of the two cells of this partition will contain the play
of the game. Similarly Smith can control the cell of the partition
\([\{a_1, a_3\}, \{a_2, a_4\}]\) that will contain the play. The final
partition is the cross-partition of these two partitions.
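
The last statement can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the labels "a1" to "a4" stand for the four paths, and the helper name cross_partition is an illustrative assumption.

```python
from itertools import product

def cross_partition(p, q):
    """Return the nonempty intersections of a cell of p with a cell of q."""
    cells = [a & b for a, b in product(p, q)]
    return [c for c in cells if c]

# Control partitions from Example 1; "a1".."a4" label the four paths of the tree.
jones_control = [frozenset({"a1", "a2"}), frozenset({"a3", "a4"})]
smith_control = [frozenset({"a1", "a3"}), frozenset({"a2", "a4"})]

final = cross_partition(jones_control, smith_control)
print(sorted(sorted(cell) for cell in final))
```

Each cell of the result contains a single path, which agrees with the partition written down for Smith's choice.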

[Figure 1: tree for the game; first level, Jones' choice; second level, Smith's choice.]

Example 2. An example from psychology. Suppose that a psychologist
conducts the following experiment with a group of rats. Each
rat is allowed in each trial to go through a T-maze of a type indicated
in Figure 2. If the rat chooses to go right, he will be fed, but if he
goes left, he will not be fed. In other experiments different feeding
schedules are used. For example, the food may always be put on the
right, or it may be put on the right two-thirds (or any other fraction)
of the time. The food might even be put each time on the side opposite
the side to which the rat went the preceding trial. A psychologist is
interested in predicting the behavior of the group of rats subjected to
a sequence of such trials.

[Figure 2: T-maze; no food on the left, food on the right.]

Consider an experiment of the above kind and denote by \(\mathcal{U}\) the set
of all rats used in the experiment. After running each of the rats
through the maze, we can form a partition by putting in one cell the
rats which went right, and in the other those which went left. Thus
each trial of the experiment determines a partition of \(\mathcal{U}\). The
partition associated with the nth trial we denote by \([R_n, L_n]\), \(R_n\) being
the set of rats which went right on the nth trial, and \(L_n\) those which
went left. A psychologist would like to predict certain properties of
the partitions after a large number of experiments. For example,
questions he might ask are the following: if the food is always placed
on the right, will \(R_n\) eventually become all, or almost all, of \(\mathcal{U}\)? In
other words, will the rats have "learned" to go right and be fed?
What will happen if each rat is fed two-thirds of the time on the right
and one-third on the left? What will happen if the experimenter does,
on each trial, the opposite of what the rat did on the preceding trial?

Example 3. Small group behavior. Some sociologists study the
behavior of a small group of people which has been given the job of
jointly solving a problem. An example of this is a jury trying to decide
the fate of a prisoner. Before a decision is reached, there is a good deal 
of discussion and argument among the members of the group, and 
experiments have been designed to study the role of each person in 
such a situation. For these experiments, observers record the name 
of the person making each remark together with the name of the 
person to whom the remark is directed. Sometimes the nature of the 
remark is recorded and also the time when it is made. 
Consider an experiment of the above kind performed with four
people, a, b, c, d. Let \(\mathcal{U}\) be the set of all remarks made. Form the
partition \([S_a, S_b, S_c, S_d]\) of \(\mathcal{U}\) where \(S_a\) is the set of all remarks made
by a, \(S_b\) the set of all remarks made by b, etc. Form also the partition
\([T_a, T_b, T_c, T_d]\) of \(\mathcal{U}\) where \(T_a\) is the set of all remarks received by a,
\(T_b\) is the set of all remarks received by b, etc. A sociologist is
interested, for example, in the following question. Order the cells of the
S partition according to the number of elements in each cell. If we
do the same for the T partition, will the order be the same? That is,
does the person who makes the most remarks also receive the most,
and the one who makes the second most receive the second most, etc.?

A second problem is the following. Suppose that a partition
\([U_1, U_2, U_3]\) of \(\mathcal{U}\) is made, where \(U_1\) is the set of remarks made in a
first interval of time, \(U_2\) those made in a second interval of time, and
\(U_3\) those made in a third and final interval of time. Then if we form
the cross-partition of this partition with each of the previous two
partitions we will have a finer analysis which shows how the discussion
changes in time. It might show, for example, that the discussion had
changed from a three-way to a two-way discussion. It could also
happen that eventually one person had made many remarks and
received few. The nature of the partitions will of course depend upon
the particular group of subjects and the particular experiment.
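
The two questions of Example 3 can be illustrated with a small computation. The sketch below is only an illustration: the list of remarks is invented, and the names made, received, ranking, and cross are assumptions introduced here, not notation from the text.

```python
from collections import Counter

# Hypothetical record of remarks as (speaker, receiver, time interval).
# The data are made up purely for illustration.
remarks = [
    ("a", "b", 1), ("a", "c", 1), ("b", "a", 1), ("c", "a", 1),
    ("a", "b", 2), ("b", "a", 2), ("a", "d", 2), ("d", "a", 2),
    ("a", "b", 3), ("b", "a", 3), ("a", "b", 3), ("c", "b", 3),
]

made = Counter(s for s, _, _ in remarks)      # sizes of the cells S_a, S_b, ...
received = Counter(r for _, r, _ in remarks)  # sizes of the cells T_a, T_b, ...

def ranking(counts):
    """People ordered by decreasing count, ties broken alphabetically."""
    return sorted("abcd", key=lambda person: (-counts[person], person))

# First question: do the S and T partitions order the people the same way?
same_order = ranking(made) == ranking(received)

# Second question: cross-partition of [U_1, U_2, U_3] with the S partition.
# The entry for (i, p) is the number of remarks made by p during interval i.
cross = Counter((interval, speaker) for speaker, _, interval in remarks)
```

Reading cross interval by interval shows who was speaking in each period, which is the finer analysis described above.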

EXERCISES 

1. Jones has two pennies and Smith has one. They agree to match pennies
three times or until one of them has no pennies, whichever happens first.
Draw a tree to represent the possible plays for the game. Show the progress
of the game by a sequence of partitions. [Ans. There are four paths.]

2. In Example 2, what information is obtained from the cross-partition
of the partitions \([R_1, L_1]\) and \([R_2, L_2]\)?

3. Suppose that, in Example 2, the psychologist makes the following 
assumptions concerning the behavior of the rats subjected to a particular 
feeding schedule. For any particular trial, 80 per cent of the rats that went 
right on the previous experiment will go right on this trial, and 60 per cent 
of those that went left on the previous experiment will go right on this trial. 

(a) If 50 per cent went right on the first trial, what per cent would 
the psychologist predict for the third trial? [Ans. 74 per cent.] 

(b) If 75 per cent went right on the first trial, what per cent would he 
predict for the second? For the third? For the hundredth? 
4. Construct a tree to represent the 16 possibilities for four runs of a
rat through the T-maze. Check the possibilities in which the rat is successful 
at least three times if: 

(a) The food is always on the right. 

(b) The food is first on the right, then on the left, then on the right, 
and then on the left. 

(c) The food is first on the right, then moved to the side opposite 
from the one that the rat went to the last time. 

[Ans. There are five such paths.]

5. In Exercise 4, if the rat knows that one of the three possible methods
of feeding is being used, how can he assure himself of three feedings in four
tries? [Ans. Go right, left, right, right.]

6. In Example 3 on small group behavior, what information can be
obtained from the cross-partition of \([S_a, S_b, S_c, S_d]\) and \([T_a, T_b, T_c, T_d]\)?

7. Suppose that in Example 3, a partition \([V, \tilde{V}]\) of \(\mathcal{U}\) is made, where
V is the set of all remarks that were made in the form of a question. What
information can be obtained from the cross-partition of \([V, \tilde{V}]\) and
\([S_a, S_b, S_c, S_d]\)? From the cross-partition of \([V, \tilde{V}]\) and \([U_1, U_2, U_3]\)?

8. Assume that every man is classified as a Republican or a Democrat. 
Let us start with a partition of the men of a given generation. Assume that 
we obtain a similar partition for their sons, and for their grandsons, etc., 
for several generations. What might be the questions a political scientist 
would wish to investigate, using these partitions? 

9. Assume that in a given generation x men are Republicans and y are 
Democrats. Assume further that it is known that 20 per cent of the sons 
of Republicans are Democrats and 30 per cent of the sons of Democrats are 
Republicans in any generation. What conditions must x and y satisfy if 
there are to be the same number of Republicans in each generation? Assume 
that the total number of men remains at 50 million in each generation. Is 
there more than one choice for x and y? If not, what must x and y be? 

[Ans. There are 30 million Republicans.] 

10. Assume that there are 30 million Democrats and 20 million Republican 
men in the country. It is known that p per cent of the sons of Democrats 
are Republicans, and q per cent of the sons of Republicans are Democrats. 
If the total number of men remains 50 million, what condition must p and q 
satisfy so that the number in each party remains the same? Is there more 
than one choice of p and q? 
3. THE NUMBER OF ELEMENTS IN A SET

The remainder of this chapter will be devoted to certain counting
problems. For any set X we shall denote by \(n(X)\) the number of
elements in the set.

Suppose we know the number of elements in certain given sets and
wish to know the number in other sets related to these by the
operations of unions, intersections, and complementation. As an example,
consider the following problem.

Suppose that we are told that 100 students take mathematics, and 
150 students take economics. Can we then tell how many take either 
mathematics or economics? The answer is no, since clearly we would 
also need to know how many students take both courses. If we know 
that no student takes both courses, i.e., if we know that the two sets
of students are disjoint, then the answer would be the sum of the
two numbers or 250 students.

In general, if we are given disjoint sets A and B, then it is true
that \(n(A \cup B) = n(A) + n(B)\). Suppose now that A and B are
not disjoint as shown in Figure 3. We can divide the set A into
disjoint sets \(A \cap \tilde{B}\) and \(A \cap B\). Similarly we can divide B into
the disjoint sets \(\tilde{A} \cap B\) and \(A \cap B\). Thus,

[Figure 3]

\[ n(A) = n(A \cap \tilde{B}) + n(A \cap B) \]
\[ n(B) = n(\tilde{A} \cap B) + n(A \cap B) \]

Adding these two equations, we obtain

\[ n(A) + n(B) = n(A \cap \tilde{B}) + n(\tilde{A} \cap B) + 2\,n(A \cap B). \]

Since the sets \(A \cap \tilde{B}\), \(\tilde{A} \cap B\), and \(A \cap B\) are disjoint sets whose
union is \(A \cup B\), we obtain the formula

\[ n(A \cup B) = n(A) + n(B) - n(A \cap B) \]

which is valid for any two sets A and B.
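
As a quick numerical check of this formula, here is a small Python sketch. The two enrollment lists are invented for illustration and do not come from the text.

```python
# Hypothetical enrollment lists; the names are invented for illustration.
mathematics = {"Adams", "Baker", "Clark", "Davis"}
economics = {"Clark", "Davis", "Evans"}

# Left side: count the union directly.
lhs = len(mathematics | economics)

# Right side: n(A) + n(B) - n(A ∩ B).
rhs = len(mathematics) + len(economics) - len(mathematics & economics)

print(lhs, rhs)
```

The two students enrolled in both courses are counted twice in the sum of the separate counts, which is why the size of the intersection is subtracted once.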
Example 1. Let p and q be statements relative to a set \(\mathcal{U}\) of logical
possibilities. Denote by P and Q the truth sets of these statements.
The truth set of \(p \lor q\) is \(P \cup Q\) and the truth set of \(p \land q\) is \(P \cap Q\).
Thus the above formula enables us to find the number of cases where
\(p \lor q\) is true if we know the number of cases for which p, q, and
\(p \land q\) are true.

Example 2. More than two sets. It is possible to derive formulas 
for the number of elements in a set which is the union of more than 
two sets (see Exercise 6), but usually it is easier to work with Venn 
diagrams. For example, suppose that the registrar of a school reports 
the following statistics about a group of 30 students: 

19 take mathematics. 

17 take music. 

11 take history. 

12 take mathematics and music. 

7 take history and mathematics. 

5 take music and history. 

2 take mathematics, history, and music. 

We draw the Venn diagram in Figure 4 and fill in the numbers for the
number of elements in each subset working from the bottom of our list
to the top. That is, since 2 students take all three courses, and 5 take
music and history, then 3 take history and music but not mathematics,
etc. Once the diagram is completed we can read off the number which
take any combination of the courses. For example, the number which
take history but not mathematics is 3 + 1 = 4.
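
The bottom-to-top procedure for filling in the diagram can be written out as arithmetic. The sketch below uses the registrar's figures from this example; the variable names are illustrative assumptions (M for mathematics, U for music, H for history).

```python
# Registrar's figures from Example 2 (M = mathematics, U = music, H = history).
n_M, n_U, n_H = 19, 17, 11
n_MU, n_HM, n_UH = 12, 7, 5
n_MUH = 2
total = 30

# Work from the innermost region outward, as described in the text.
only_MU = n_MU - n_MUH          # mathematics and music, not history
only_HM = n_HM - n_MUH          # history and mathematics, not music
only_UH = n_UH - n_MUH          # music and history, not mathematics
only_M = n_M - only_MU - only_HM - n_MUH
only_U = n_U - only_MU - only_UH - n_MUH
only_H = n_H - only_HM - only_UH - n_MUH
none = total - (only_M + only_U + only_H + only_MU + only_HM + only_UH + n_MUH)

# History but not mathematics: the two history regions outside M.
history_not_math = only_UH + only_H
```

Any other combination of courses can be read off in the same way by adding the appropriate regions.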

Example 3. Cancer studies. The following reasoning is often found
in statistical studies on the effect of smoking on the incidence of lung
cancer. Suppose a study has shown that the fraction of smokers
among those who have lung cancer is greater than the fraction of
smokers among those who do not have lung cancer. It is then asserted
that the fraction of smokers who have lung cancer is greater
than the fraction of nonsmokers who have lung cancer. Let us
examine this argument.

Let S be the set of all smokers in the population, and C be the
set of all people with lung cancer. Let \(a = n(S \cap C)\), \(b = n(\tilde{S} \cap C)\),
\(c = n(S \cap \tilde{C})\), and \(d = n(\tilde{S} \cap \tilde{C})\), as indicated in
Figure 5. The fractions in which we are interested are

\[ p_1 = \frac{a}{a + b}, \quad p_2 = \frac{c}{c + d}, \quad p_3 = \frac{a}{a + c}, \quad p_4 = \frac{b}{b + d}, \]

where \(p_1\) is the fraction of those with lung cancer that smoke, \(p_2\) the
fraction of those without lung cancer that smoke, \(p_3\) the fraction of
smokers who have lung cancer, and \(p_4\) the fraction of nonsmokers
who have cancer.

The argument above states that if \(p_1 > p_2\), then \(p_3 > p_4\). The
hypothesis, \(\frac{a}{a + b} > \frac{c}{c + d}\), is true if and only if \(ac + ad > ac + bc\),
that is, if and only if \(ad > bc\). The conclusion \(\frac{a}{a + c} > \frac{b}{b + d}\) is true
if and only if \(ab + ad > ab + bc\), that is, if and only if \(ad > bc\). Thus
the two statements \(p_1 > p_2\) and \(p_3 > p_4\) are in fact equivalent
statements, so that the argument is valid.
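
The equivalence can also be spot-checked numerically. The sketch below tries every choice of positive counts a, b, c, d below 8, an arbitrary range chosen only to keep the check small. The function names are assumptions introduced here.

```python
from fractions import Fraction
from itertools import product

def fractions_of_interest(a, b, c, d):
    """Return (p1, p2, p3, p4) for positive region counts a, b, c, d."""
    p1 = Fraction(a, a + b)  # smokers among those with lung cancer
    p2 = Fraction(c, c + d)  # smokers among those without lung cancer
    p3 = Fraction(a, a + c)  # lung cancer among smokers
    p4 = Fraction(b, b + d)  # lung cancer among nonsmokers
    return p1, p2, p3, p4

def statements_agree(a, b, c, d):
    """True when p1 > p2, p3 > p4, and ad > bc all have the same truth value."""
    p1, p2, p3, p4 = fractions_of_interest(a, b, c, d)
    return (p1 > p2) == (p3 > p4) == (a * d > b * c)

# Exhaustive check over a small grid of positive counts.
all_agree = all(statements_agree(*counts) for counts in product(range(1, 8), repeat=4))
print(all_agree)
```

Exact fractions are used instead of floating-point division so that the comparisons are not affected by rounding.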

EXERCISES 

1. In Example 2 find: 

(a) The number of students that take mathematics but do not take
history. [Ans. 12.]

(b) The number that take exactly two of the three courses. 

(c) The number that take one or none of the courses. 



[Figure 5]
2. In a chemistry class there are 20 students, and in a psychology class
there are 30 students. Find the number in either the psychology class or the
chemistry class if:

(a) The two classes meet at the same hour. [Ans. 50.]

(b) The two classes meet at different hours and 10 students are
enrolled in both courses. [Ans. 40.]

3. If the truth set of a statement p has 10 elements, and the truth set
of a statement q has 20 elements, find the number of elements in the truth
set of \(p \lor q\) if:

(a) p and q are inconsistent.

(b) p and q are consistent and there are 2 elements in the truth set
of \(p \land q\).

4. If p is a statement that is true in ten cases, and q is a statement that
is true in five cases, find the number of cases that both p and q are true if
\(p \lor q\) is true in ten cases. What relation holds between p and q?

5. Assume that the incidence of lung cancer is 15 per 100,000, and
that it is estimated that 75 per cent of those with lung cancer smoke and 60
per cent of those without lung cancer smoke. (These numbers are fictitious.)
Estimate the fraction of smokers with lung cancer, and the fraction of
nonsmokers with lung cancer. [Ans. 18.75 and 9.375 per 100,000.]

6. Let A, B, and C be any three sets of a universal set \(\mathcal{U}\). Draw a Venn
diagram and show that

\[ n(A \cup B \cup C) = n(A) + n(B) + n(C) - n(A \cap B) - n(B \cap C) - n(A \cap C) + n(A \cap B \cap C). \]


7. Analyze the data given below and draw a Venn diagram like that in
Figure 4. Assuming that every student in the school takes one of the courses,
find the number of students in the school.

    (a)   (b)
    28    36    students take English.
    23    23    students take French.
    23    13    students take German.
    12     6    students take English and French.
    11    11    students take English and German.
     8     4    students take French and German.
     5     1    students take all three courses.

Comment on the result in (b).
8. Suppose that in a survey concerning the reading habits of students
it is found that: 

60 per cent read magazine A. 

50 per cent read magazine B. 

50 per cent read magazine C. 

30 per cent read magazines A and B. 

20 per cent read magazines B and C. 

30 per cent read magazines A and C. 

10 per cent read all three magazines. 

(a) What per cent read exactly two magazines? [Ans. 50.]

(b) What per cent do not read any of the magazines? [Ans. 10.] 

9. If p and q are equivalent statements and \(n(P) = 10\), what is \(n(P \cup Q)\)?

10. If p implies q, prove that \(n(P \cup \tilde{Q}) = n(P) + n(\tilde{Q})\).

11. On a transcontinental airliner, there are 9 boys, 5 American children,
9 men, 7 foreign boys, 14 Americans, 6 American males, and 7 foreign females.
What is the number of people on the plane? [Ans. 33.]


4. PERMUTATIONS 


We wish to consider here the number of ways in which a group of n
different objects can be arranged. An arrangement of n different
objects in a given order is called a permutation of the n objects. We
consider first the case of three objects, a, b, and c. We can exhibit
all possible permutations of these three objects as paths of a tree,
as shown in Figure 6. Each path exhibits a possible permutation,
and there are six such paths. We could also list these permutations
as follows:

abc, bca,

acb, cab,

bac, cba.

[Figure 6]

If we were to construct a similar tree for n objects, we would find
that the number of paths could be found by multiplying together
the numbers n, n - 1, n - 2, continuing down to the number 1.
The number obtained in this way occurs so often that we give it
a symbol, namely \(n!\), which is read "n factorial." Thus, for example,
\(3! = 3 \cdot 2 \cdot 1 = 6\), \(4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24\), etc. For reasons which will be
clear later, we define \(0! = 1\). Thus we can say there are \(n!\) different
permutations of n distinct objects.
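
The six paths of the tree can be generated directly and compared with 3!. This is a minimal Python sketch using the standard library. The variable name paths is an illustrative choice.

```python
from itertools import permutations
from math import factorial

# Every ordering of the three objects a, b, c, one per path of the tree.
paths = ["".join(p) for p in permutations("abc")]

print(paths)
print(len(paths), factorial(3))
```

The same call with a longer string lists the permutations of more objects, although the list grows as n! and soon becomes too long to print.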

Example 1. In the game of Scrabble, suppose there are seven 
lettered blocks from which we try to form a seven-letter word. If the 
seven letters are all different, we must consider 7! = 5040 different 
orders. 

Example 2. A quarterback has a sequence of ten plays. Suppose 
his coach instructs him to run through the ten-play sequence without 
repetition. How much freedom is left to the quarterback? He may 
choose any one of 10! = 3,628,800 orders in which to call the plays. 

Example 3. How many ways can n people be seated around a
circular table? When this question is asked, it is usually understood
that two arrangements are different only if at least one person has a
different person next to him in the two arrangements. Consider then
one person in a fixed position. There are \((n - 1)!\) ways in which the
other people may be seated. We have now counted all the
arrangements we wish to consider different. Why?
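
The count (n - 1)! can be confirmed by brute force for small n. The sketch below makes one assumption explicit: two seatings are treated as the same exactly when one is a rotation of the other, so a person's left-hand and right-hand neighbors are distinguished. The function name is hypothetical.

```python
from itertools import permutations
from math import factorial

def circular_seatings(n):
    """Count seatings of n people around a table, treating rotations as the same."""
    seen = set()
    for p in permutations(range(n)):
        # Rotate the seating so that person 0 occupies the fixed position.
        i = p.index(0)
        seen.add(p[i:] + p[:i])
    return len(seen)

print(circular_seatings(5), factorial(4))
```

Rotating person 0 into a fixed seat is the same device as in the text: once that person is placed, the remaining seats can be filled freely.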

A general principle. There are many counting problems for which 
it is not possible to give a simple formula for the number of possible 
cases. In many of these the only way to find the number of cases is 
to draw a tree and count them (see Exercise 4). In some problems, 
the following general principle is useful. 

If one thing can be done in exactly r different ways, for each of these a
second thing can be done in exactly s different ways, for each of the first
two, a third can be done in exactly t ways, etc., then the sequence of things
can be done in the product of the numbers of ways in which the individual
things can be done, i.e., \(r \cdot s \cdot t \cdots\).

The validity of the above general principle can be established by
thinking of a tree representing all the ways in which the sequence of
things can be done. There would be r branches from the starting
position. From the ends of each of these r branches there would be
s new branches, and from each of these t new branches, etc. The
number of paths through the tree would be given by the product
\(r \cdot s \cdot t \cdots\).

Example 4. The number of permutations of n distinct objects is
a special case of this principle. If we were to list all the possible
permutations, there would be n possibilities for the first, for each of these
n - 1 for the second, etc., until we came to the last object, for
which there is only one possibility. Thus there are \(n(n - 1) \cdots 1 = n!\)
possibilities in all.

Example 5. If there are three roads from city x to city y and
two roads from city y to city z, then there are \(3 \cdot 2 = 6\) ways that a
person can drive from city x to city z passing through city y.
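
The general principle corresponds to forming every pairing of a first choice with a second choice. A short Python sketch for the roads of Example 5 follows. The road labels are hypothetical.

```python
from itertools import product

roads_x_to_y = ["r1", "r2", "r3"]  # hypothetical labels for the three roads
roads_y_to_z = ["s1", "s2"]        # hypothetical labels for the two roads

# Each route is one choice of road for each leg of the trip.
routes = list(product(roads_x_to_y, roads_y_to_z))

print(len(routes))
```

Adding a third leg with t roads would simply add a third argument to product, giving r·s·t routes.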

Example 6. Suppose there are n applicants for a certain job. Three
interviewers are asked independently to rank the applicants according
to their suitability for the job. It is decided that an applicant will
be hired if he is ranked first by at least two of the three interviewers.
What fraction of the possible reports would lead to the acceptance
of some candidate? We shall solve this problem by finding the
fraction of the reports which do not lead to an acceptance and subtract
this answer from 1. Frequently an indirect attack of this kind on a
problem is easier than the direct approach. The total number of
reports possible is \((n!)^3\) since each interviewer can rank the men in \(n!\)
different ways. If a particular report does not lead to the acceptance
of a candidate, it must be true that each interviewer has put a
different man in first place. This can be done in \(n(n - 1)(n - 2)\) different
ways by our general principle. For each possible set of first choices, there
are \([(n - 1)!]^3\) ways in which the remaining men can be ranked by
the interviewers. Thus the number of reports which do not lead to
acceptance is \(n(n - 1)(n - 2)[(n - 1)!]^3\). Dividing this number by
\((n!)^3\) we obtain

\[ \frac{(n - 1)(n - 2)}{n^2} \]

as the fraction of reports which fail to accept a candidate. The
fraction which leads to acceptance is found by subtracting this fraction
from 1 which gives

\[ \frac{3n - 2}{n^2}. \]

For the case of three applicants, we see that 7/9 of the possibilities lead
to acceptance. Here the procedure might be criticized on the grounds
that even if the interviewers are completely ineffective and are
essentially guessing there is a good chance that a candidate will be
accepted on the basis of the reports. For n equal to ten, the fraction
of acceptances is only .28, so that it is possible to attach more
significance to the interviewers' ratings, if they reach a decision.
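
For small n the fraction can be confirmed by listing every possible report. The sketch below enumerates all triples of rankings, which is only practical for a handful of applicants. The function name acceptance_fraction is an assumption introduced here.

```python
from fractions import Fraction
from itertools import permutations, product

def acceptance_fraction(n):
    """Fraction of reports in which some applicant is ranked first at least twice."""
    rankings = list(permutations(range(n)))  # all n! rankings of the applicants
    accepted = 0
    total = 0
    for report in product(rankings, repeat=3):  # one ranking per interviewer
        total += 1
        first_choices = [ranking[0] for ranking in report]
        # Fewer than three distinct first choices means two interviewers agree.
        if len(set(first_choices)) < 3:
            accepted += 1
    return Fraction(accepted, total)

print(acceptance_fraction(3))
```

For three applicants this examines 216 reports, and for four applicants 13,824, so the enumeration is a check on the formula rather than a substitute for it.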

EXERCISES 

1. In how many ways can five people be lined up in a row for a group
picture? In how many ways if it is desired to have three in the front row and
two in the back row? [Ans. 120; 120.]

2. Assuming that a baseball team is determined by the players and the
position each is playing, how many teams can be made from 13 players if:

(a) Each player can play any position? 

(b) Two of the players can be used only as pitchers? 

3. Grades of A, B, C, D, or E are assigned to a class of five students.

(a) How many ways may this be done, if no two students receive
the same grade? [Ans. 120.]

(b) Two of the students are named Smith and Jones. How many
ways can grades be assigned if no two students receive the same
grade and Smith must receive a higher grade than Jones?
[Ans. 60.]

(c) How many ways may grades be assigned if only grades of A and E
are assigned? [Ans. 32.]

4. A certain club wishes to admit seven new members, four of whom are 
Republicans and three of whom are Democrats. Suppose the club wishes to 
admit them one at a time and in such a way that there are always more 
Republicans among the new members than there are Democrats. Draw a 
tree to represent all possible ways in which new members can be admitted, 
distinguishing members by their party only. 

5. There are three different routes connecting city A to city B. How 
many ways can a round trip be made from A to B and back? How many 
ways if it is desired to take a different route on the way back? [Ans. 9; 6.] 

6. How many different ways can a ten-question multiple-choice exam be 
answered if each question has three possibilities, a, b, and c? How many if 
no two consecutive answers are the same? 

7. Modify Example 6 so that, to be accepted, an applicant must be first
in two of the interviewers' ratings and must be either first or second in the
third interviewer's rating. What fraction of the possible reports lead to
acceptance in the case of three applicants? In the case of n?
[Ans. 4/9; \(4/n^2\).]

8. A town has 1240 registered Republicans. It is desired to contact each 
of these by phone to announce a meeting. A committee of r people devise a 
method of phoning s people each and asking each of these to call t new people. 
If the method is such that no person is called twice, 

(a) How many people know about the meeting after the phoning? 

(b) If the committee has 40 members and it is desired that all 1240 
Republicans be informed of the meeting and that s and t should be 
the same, what should they be? 

9. In the Scrabble example, suppose the letters are Q, Q, U, F, F, F, A.
How many distinguishable arrangements are there for these seven letters?
[Ans. 420.]

10. How many different necklaces can be made

(a) If seven different sized beads are available? [Ans. 360.]

(b) If six of the beads are the same size and one is larger? [Ans. 1.]

(c) If the beads are of two sizes, five of the smaller size and two of the
larger size? [Ans. 3.]

11. Prove that two people in Columbus, Ohio, have the same initials.

12. Find the number of arrangements of the five symbols that can be
distinguished. (The same letters with different subscripts indicate
distinguishable objects.)

(a) \(A_1, A_2, B_1, B_2, B_3\). [Ans. 120.]

(b) \(A, A, B_1, B_2, B_3\). [Ans. 60.]

(c) \(A, A, B, B, B\). [Ans. 10.]

13. Show that the number of distinguishable arrangements possible for
n objects, \(n_1\) of type 1, \(n_2\) of type 2, etc., for r different types is

\[ \frac{n!}{n_1!\, n_2! \cdots n_r!}. \]


5. COUNTING PARTITIONS 

Up to now we have not had occasion to consider the partitions
\([\{1, 2\}, \{3, 4\}]\) and \([\{3, 4\}, \{1, 2\}]\) of the integers from 1 to 4 as being
different partitions. Here it will be convenient to do so, and to
indicate this distinction we shall use the term ordered partition. An
ordered partition with r cells is a partition with r cells (some of which
may be empty), with a particular order specified for the cells.
We are interested in counting the number of possible ordered
partitions with r cells that can be formed from a set of n objects having a
prescribed number of elements in each cell. We consider first a special
case to illustrate the general procedure.

Suppose that we have eight students, A, B, C, D, E, F, G, and H,
and we wish to assign these to three rooms, Room 1 which is a triple
room, Room 2, a triple room, and Room 3, a double room. In how
many different ways can the assignment be made? One way to assign
the students is to put them in the rooms in the order in which they
arrive, putting the first three in Room 1, the next three in Room 2,
and the last two in Room 3. There are 8! ways in which the students
can arrive, but not all of these lead to different assignments. We can
represent the assignment corresponding to a particular order of
arrival as follows,

|BCA|DFE|HG|.

In this case, B, C, and A are assigned to Room 1, D, F, and E to
Room 2, and H and G to Room 3. Notice that orders of arrival
which simply change the order within the rooms lead to the same
assignment. The number of different orders of arrival which lead to
the same assignment as the one above is the number of arrangements
which differ from the given one only in that the arrangement within
the rooms is different. There are \(3! \cdot 3! \cdot 2!\) such orders of arrival, since
we can arrange the three in Room 1 in 3! different ways, for each of
these the ones in Room 2 in 3! different ways, and for each of these,
the ones in Room 3 in 2! ways. Thus we can divide the 8! different
orders of arrival into groups of \(3! \cdot 3! \cdot 2!\) different orders such that all
the orders of arrival in a single group lead to the same room
assignment. Since there are \(3! \cdot 3! \cdot 2!\) elements in each group and 8! elements
altogether, there are \(\frac{8!}{3!\,3!\,2!}\) groups, or this many different room
assignments.

The same argument could be carried out for n elements and r rooms
with \(n_1\) in the first, \(n_2\) in the second, etc. This would lead to the
following result. Let \(n_1, n_2, \ldots, n_r\) be nonnegative integers with
\(n_1 + n_2 + \cdots + n_r = n\). Then,

The number of ordered partitions with r cells \([A_1, A_2, A_3, \ldots, A_r]\)
of a set of n elements with \(n_1\) in the first cell, \(n_2\) in the second, etc. is

\[ \frac{n!}{n_1!\, n_2! \cdots n_r!}. \]

We shall denote this number by the symbol

\[ \binom{n}{n_1, n_2, \ldots, n_r}. \]
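
The counting formula is easy to evaluate and, for the room example, easy to confirm by brute force. In the sketch below the function name ordered_partitions is an assumption. The brute-force count assigns each of the eight students a room number and keeps the assignments with room sizes 3, 3, and 2.

```python
from itertools import product
from math import factorial

def ordered_partitions(*cell_sizes):
    """Number of ordered partitions of sum(cell_sizes) elements with these cell sizes."""
    count = factorial(sum(cell_sizes))
    for size in cell_sizes:
        count //= factorial(size)  # each division is exact
    return count

# Brute force for the room example: one room number (0, 1, or 2) per student.
brute = sum(
    1
    for rooms in product(range(3), repeat=8)
    if (rooms.count(0), rooms.count(1), rooms.count(2)) == (3, 3, 2)
)

print(ordered_partitions(3, 3, 2), brute)
```

Both counts are 560. The brute-force loop visits all 3^8 = 6561 possible assignments, which is why the formula is preferable for anything larger.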


The special case of two cells is particularly important. Here the
problem can be stated equivalently as the problem of finding the
number of subsets with r elements that can be chosen from a set of
n elements. This is true because any choice defines a partition \([A, \tilde{A}]\),
where A is the set of elements chosen and \(\tilde{A}\) is the set of remaining
elements. The number of such partitions is \(\frac{n!}{r!\,(n - r)!}\) and hence this is
also the number of subsets with r elements. Our notation \(\binom{n}{r, n - r}\)
for this case is shortened to \(\binom{n}{r}\).

Notice that \(\binom{n}{n - r}\) is the number of subsets with n - r elements
which can be chosen from n, which is the number of partitions of the
form \([\tilde{A}, A]\) above. Clearly, this is the same as the number of
\([A, \tilde{A}]\) partitions. Hence

\[ \binom{n}{r} = \binom{n}{n - r}. \]
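
The pairing of each subset with its complement can be made concrete. The following Python sketch uses a five-element set chosen arbitrarily for illustration and relies on math.comb, which is available in Python 3.8 and later.

```python
from itertools import combinations
from math import comb

elements = {"a", "b", "c", "d", "e"}  # any small set will do
n, r = len(elements), 2

# Every subset A with r elements, and the complement of each one.
chosen = [frozenset(c) for c in combinations(sorted(elements), r)]
left_over = {frozenset(elements - c) for c in chosen}

print(len(chosen), len(left_over), comb(n, r), comb(n, n - r))
```

Distinct subsets have distinct complements, so the two collections have the same size, which is the symmetry between choosing r elements and choosing n - r elements.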




Example 1. A college has scheduled six football games during a
season. How many ways can the season end in two wins, three losses,
and one tie? From each possible outcome of the season, we form a
partition with three cells of the opposing teams. In the first cell we
put the teams which our college defeats, in the second the teams to
which our college loses, and in the third cell the teams which our
college ties. There are \(\binom{6}{2, 3, 1} = 60\) such partitions, and hence 60
ways in which the season can end with two wins, three losses, and
one tie.

Example 2. In the game of bridge the hands N, E, S, and W
determine a partition of the 52 cards having four cells each with thirteen
elements. Thus there are \(\binom{52}{13, 13, 13, 13} = \frac{52!}{13!\,13!\,13!\,13!}\) different bridge deals. This
number is about \(5.3645 \cdot 10^{28}\) or approximately 54 billion billion billion
deals.
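
Because Python integers have unlimited size, the number of bridge deals can be computed exactly. This is a minimal sketch. The variable name bridge_deals is an illustrative choice.

```python
from math import factorial

# 52! divided by 13! for each of the four hands; the division is exact.
bridge_deals = factorial(52) // factorial(13) ** 4

print(bridge_deals)
print(f"{bridge_deals:.4e}")
```

The second line prints the value in scientific notation for comparison with the approximation quoted in the example.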


Example 3. The following example will be important in
probability theory, which we take up in the next chapter. If a coin is thrown
six times, there are \(2^6\) possibilities for the outcome of the six throws,
since each throw can result in either a head or a tail. How many of
these possibilities result in four heads and two tails? Each sequence
of six heads and tails determines a two-cell partition of the numbers
from one to six as follows: in the first cell put the numbers
corresponding to throws which resulted in a head, and in the second put the
numbers corresponding to throws which resulted in tails. We require
that the first cell should contain four elements and the second two
elements. Hence the number of the \(2^6\) possibilities which lead to four
heads and two tails is the number of two-cell partitions of six elements
which have four elements in the first cell and two in the second cell.
The answer is \(\binom{6}{4} = 15\). For n throws of a coin, a similar analysis
shows that there are \(\binom{n}{r}\) different sequences of H's and T's of length
n which have exactly r heads and n - r tails.
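
The count for six throws can be verified by listing all 2^6 sequences. The sketch below does this directly. The function name is hypothetical, and math.comb requires Python 3.8 or later.

```python
from itertools import product
from math import comb

def sequences_with_heads(n, r):
    """Count the length-n sequences of H and T that contain exactly r heads."""
    return sum(1 for seq in product("HT", repeat=n) if seq.count("H") == r)

print(sequences_with_heads(6, 4), comb(6, 4))
```

Summing the counts over r = 0, 1, ..., 6 recovers all 64 sequences, since every sequence has some definite number of heads.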


EXERCISES 


1. Compute the following numbers.

(a) \(\binom{7}{5}\) [Ans. 21.]

(b) (!)

(c) G)

(d) (S) [Ans. 250.]

(e) ©

(f) \(\binom{5}{1, 2, 2}\)

(g) \(\binom{4}{2, 0, 2}\) [Ans. 6.]

(h) (i,u)

2. Give an interpretation for $\binom{n}{0}$ and also for $\binom{n}{n}$. Can you now give
a reason for making 0! = 1?

3. How many ways can nine students be assigned to three triple rooms?
How many ways if one particular pair of students refuse to room together?

[Ans. 1680; 1260.] 

4. A group of seven boys and ten girls attends a dance. If all the boys 
dance in a particular dance, how many possibilities are there for the girls 
who dance? For the girls who do not dance? How many possibilities are 
there for the girls who do not dance, if three of the girls are sure to be asked 
to dance? 

5. Suppose that a course is given at three different hours. If fifteen 
students sign up for the course 

(a) How many possibilities are there for the ways the students could 

distribute themselves in the classes? [Ans. $3^{15}$.]

(b) How many of the ways would give the same number of students 

in each class? [Ans. 756,756.] 

6. A college professor anticipates teaching the same course for the next 
35 years. So as not to become bored with his jokes, he decides to tell exactly
three jokes every year and in no two years to tell exactly the same three 
jokes. What is the minimum number of jokes that will accomplish this? 
What is the minimum number if he determines never to tell the same joke 
twice? 

7. How many ways can you answer a ten-question true-false exam, marking
the same number of answers true as you do false? How many if it is
desired to have no two consecutive answers the same?

8. From three Republicans and three Democrats, find the number of 


committees of three which can be formed, 

(a) With no restrictions. [Ans. 20.] 

(b) With three Republicans and no Democrats. [Ans. 1.] 

(c) With two Republicans and one Democrat. [Ans. 9.] 

(d) With one Republican and two Democrats. [Ans. 9.] 

(e) With no Republicans and three Democrats. [Ans. 1.] 


What is the relation between your answer in part (a) and the answers to the 
remaining four parts? 


9. Problem 8 suggests that the following should be true.

$$\binom{2n}{n} = \binom{n}{0}^2 + \binom{n}{1}^2 + \dots + \binom{n}{n}^2.$$

Show that it is true.
10. A student needs to choose two electives from six possible courses.

(a) How many ways can he make his choice? [Ans. 15.]

(b) How many ways can he choose if two of the courses meet at the
same time? [Ans. 14.]

(c) How many ways can he choose if two of the courses meet at 10
o’clock, two at 11 o’clock, and there are no other conflicts among
the courses? [Ans. 13.]


6. SOME PROPERTIES OF THE NUMBERS $\binom{n}{j}$


The numbers $\binom{n}{j}$ introduced in the previous section will play an
important role in our future work. We give here some of the more
important properties of these numbers.

A convenient way to obtain these numbers is given by the famous
Pascal triangle, shown in Figure 7. To obtain the triangle we first
write the 1's down the sides. Any of the other numbers in the triangle
has the property that it is the sum of the two adjacent numbers in
the row just above. Thus the next row in the triangle is 1, 6, 15, 20,
15, 6, 1. To find the number $\binom{n}{j}$ we look in the row corresponding to
the number n and see where the diagonal line corresponding to the
value of j intersects this row. For example, $\binom{4}{2} = 6$ is in the row
marked n = 4 and on the diagonal marked j = 2.

[Figure 7: the Pascal triangle, with rows marked by n and diagonals marked by j]
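The construction rule just described (1's down the sides, every other entry the sum of the two adjacent entries in the row above) is easy to mechanize. The sketch below is only an illustration; the helper name `pascal_rows` is hypothetical and does not come from the text.

```python
def pascal_rows(last_n):
    """Return rows n = 0, 1, ..., last_n of the Pascal triangle as lists."""
    rows = [[1]]
    for _ in range(last_n):
        prev = rows[-1]
        # Each interior entry is the sum of the two adjacent entries above it.
        inner = [prev[j - 1] + prev[j] for j in range(1, len(prev))]
        rows.append([1] + inner + [1])
    return rows

triangle = pascal_rows(6)
for n, row in enumerate(triangle):
    print(n, row)
```

With this layout the entry in row n and position j is `triangle[n][j]`, so `triangle[4][2]` is the number found in the row marked n = 4 on the diagonal marked j = 2.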

The property of the numbers $\binom{n}{j}$ upon which the triangle is
based is

$$\binom{n+1}{j} = \binom{n}{j-1} + \binom{n}{j}.$$

This fact can be verified directly (see Exercise 6), but the following
argument is interesting in itself. The number $\binom{n+1}{j}$ is the number
of subsets with j elements that can be formed from a set of n + 1
elements. Select one of the n + 1 elements, x. The $\binom{n+1}{j}$ subsets
can be partitioned into those that contain x, and those that do not.
The latter are subsets of j elements formed from n objects, and hence
there are $\binom{n}{j}$ such subsets. The former are constructed by adding x
to a subset of j - 1 elements formed from n elements, and hence
there are $\binom{n}{j-1}$ of them. Thus

$$\binom{n+1}{j} = \binom{n}{j-1} + \binom{n}{j}.$$


If we look again at the Pascal triangle, we observe that the numbers
in a given row increase for a while, and then decrease. We can
prove this fact in general by considering the ratio of two successive
terms,

$$\frac{\binom{n}{j+1}}{\binom{n}{j}} = \frac{n!}{(j+1)!\,(n-j-1)!} \cdot \frac{j!\,(n-j)!}{n!} = \frac{n-j}{j+1}.$$

The numbers increase as long as the ratio is greater than 1, i.e.,
n - j > j + 1. This means that j < $\frac{1}{2}(n - 1)$. We must distinguish
the case of an even n from an odd n. For example, if n = 10, j must
be less than $\frac{1}{2}(10 - 1) = 4.5$. Hence for j up to 4 the terms are
increasing, from j = 5 on the terms decrease. For n = 11, j must be
less than $\frac{1}{2}(11 - 1) = 5$. For j = 5, (n - j)/(j + 1) = 1. Hence, up
to j = 5 the terms increase, then $\binom{11}{5} = \binom{11}{6}$, and then the terms
decrease.
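The ratio argument can be spot-checked numerically. The following sketch is an illustration under our own naming (`successive_ratio` and `rising_steps` are hypothetical helpers); it uses exact fractions so that the comparison with 1 involves no rounding.

```python
from fractions import Fraction
from math import comb

def successive_ratio(n, j):
    """Ratio of the (j+1)st binomial coefficient to the jth, for a fixed n."""
    return Fraction(comb(n, j + 1), comb(n, j))

def rising_steps(n):
    """Values of j for which the term at j + 1 is strictly larger than the term at j."""
    return [j for j in range(n) if successive_ratio(n, j) > 1]

# The ratio agrees with (n - j)/(j + 1) for every n and j tried here.
ratio_matches = all(
    successive_ratio(n, j) == Fraction(n - j, j + 1)
    for n in range(1, 13)
    for j in range(n)
)
print(ratio_matches, rising_steps(10), rising_steps(11), successive_ratio(11, 5))
```

For n = 10 and for n = 11 the terms rise for j = 0, ..., 4; in the odd case the ratio at j = 5 is exactly 1, which is the pair of equal middle terms.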


EXERCISES

1. Extend the Pascal triangle to n = 16. Save the result for later use.

2. Prove that

$$\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \dots + \binom{n}{n} = 2^n,$$

using the fact that a set with n elements has $2^n$ subsets.

3. For a set of ten elements prove that there are more subsets with five
elements than there are subsets with any other fixed number of elements.

4. Using the fact that $\binom{n}{j+1} = \frac{n-j}{j+1} \binom{n}{j}$, compute $\binom{30}{s}$ for s =
1, 2, 3, 4 from the fact that $\binom{30}{0} = 1$. [Ans. 30; 435; 4060; 27,405.]

5. There are $\binom{52}{13}$ different possible bridge hands. Assume that a list is
made showing all these hands, and that in this list the first card in every hand
is crossed out. This leaves us with a list of twelve-card hands. Prove that
at least two hands in the latter list contain exactly the same cards.

6. Prove that

$$\binom{n+1}{j} = \binom{n}{j-1} + \binom{n}{j}$$

using only the fact that

$$\binom{n}{j} = \frac{n!}{j!\,(n-j)!}.$$

7. Construct a triangle in the same way that the Pascal triangle was
constructed, except that whenever you add two numbers, use the addition
table in Chapter II, Figure 11(a). Construct the triangle for 16 rows. What
does this triangle tell you about the numbers in the Pascal triangle? Use
this result to check your triangle in Exercise 1.

8. In the triangle obtained in Exercise 7, what property do the rows 1, 2,
4, 8, and 16 have in common? What does this say about the numbers in the
corresponding rows of the Pascal triangle? What would you predict for the
terms in the 32nd row of the Pascal triangle?
9. For the following table state how one row is obtained from the preceding
row and give the relation of this table to the Pascal triangle.

    1    1    1    1    1    1    1
    1    2    3    4    5    6    7
    1    3    6   10   15   21   28
    1    4   10   20   35   56   84
    1    5   15   35   70  126  210
    1    6   21   56  126  252  462
    1    7   28   84  210  462  924

10. Referring to the table in Exercise 9, number the columns starting with
0, 1, 2, . . . and number the rows starting with 1, 2, 3, . . . . Let f(n,r) be the
element in the nth column and the rth row. The table was constructed by the
rule

f(n,r) = f(n-1,r) + f(n,r-1)

for n > 0 and r > 1, and f(n,1) = f(0,r) = 1 for all n and r. Verify that

$$\binom{n+r-1}{n}$$

satisfies these conditions and is in fact the only choice for f(n,r) which will
satisfy the conditions.

11. Consider a set {a, a, a} of three objects which cannot be distinguished
from one another. Then the ordered partitions with two cells which could
be distinguished are

[{a, a, a}, $\mathcal{E}$]

[{a, a}, {a}]

[{a}, {a, a}]

[$\mathcal{E}$, {a, a, a}].

List all such ordered partitions with three cells. How many are there?

[Ans. 10.]

12. Let f(n,r) be the number of distinguishable ordered partitions with r
cells which can be formed from a set of n indistinguishable objects. Show
that f(n,r) satisfies the conditions

f(n,r) = f(n-1,r) + f(n,r-1)

for n > 0 and r > 1, and f(n,1) = f(0,r) = 1 for all n and r.

(Hint: Show that f(n,r-1) is the number of partitions which have the last
cell empty and f(n-1,r) is the number which have at least one element in
the last cell.)

13. Using the results of Exercises 10 and 12 show that the number of
distinguishable ordered partitions with r cells which can be formed from a
set of n indistinguishable objects is

$$\binom{n+r-1}{n}.$$

14. Assume that a mailman has seven letters to put in three mail boxes.
How many ways can this be done if the letters are not distinguished?

[Ans. 36.]

15. By an ordered partition with r elements of n we mean a sequence of
nonnegative integers, possibly some 0, written in a definite order, and having
sum n. For example, {1, 0, 3} and {3, 0, 1} are two different ordered partitions
with 3 elements of 4. Show that the number of ordered partitions with
r elements of n is

$$\binom{n+r-1}{n}.$$
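A brute-force count is a useful sanity check on the closed form $\binom{n+r-1}{n}$ that the last few exercises lead to; it is not a substitute for the proofs asked for. In the sketch below the function name `ordered_partitions` is our own, and the enumeration simply tries every sequence of r integers between 0 and n.

```python
from itertools import product
from math import comb

def ordered_partitions(n, r):
    """All sequences of r nonnegative integers whose sum is n (brute force)."""
    return [cells for cells in product(range(n + 1), repeat=r) if sum(cells) == n]

three_cells = ordered_partitions(3, 3)   # three indistinguishable objects, three cells
mailman = ordered_partitions(7, 3)       # seven letters, three mail boxes
print(len(three_cells), comb(3 + 3 - 1, 3))
print(len(mailman), comb(7 + 3 - 1, 7))
```

The enumeration examines $(n+1)^r$ candidate sequences, so it is practical only for small n and r.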


7. BINOMIAL AND MULTINOMIAL THEOREMS 


It is sometimes necessary to expand products of the form $(x + y)^3$,
$(x + 2y + 11z)^5$, etc. In this section we shall consider systematic
ways of carrying out such expansions.

Consider first the special case $(x + y)^3$. We write this as

$$(x + y)^3 = (x + y)(x + y)(x + y).$$

To perform the multiplication, we choose either an x or a y from each
of the three factors and multiply our choices together; we do this for
all possible choices and add the results. We represent a particular set
of choices by a two-cell partition of the numbers 1, 2, 3. In the first
cell we put the numbers which correspond to factors from which we
chose an x. In the second cell we put the numbers which correspond
to factors from which we chose a y. For example, the partition
[{1, 3}, {2}] corresponds to a choice of x from the first and third
factors and y from the second. The product so obtained is $xyx = x^2 y$.
The coefficient of $x^2 y$ in the expansion of $(x + y)^3$ will be the number
of partitions which lead to a choice of two x's and one y. That is,
the number of two-cell partitions of three elements with two elements
in the first cell and one in the second, which is $\binom{3}{2} = 3$. More generally
the coefficient of the term of the form $x^j y^{3-j}$ will be $\binom{3}{j}$ for
j = 0, 1, 2, 3. Thus we can write the desired expansion as

$$(x + y)^3 = \binom{3}{3}x^3 + \binom{3}{2}x^2 y + \binom{3}{1}x y^2 + \binom{3}{0}y^3 = x^3 + 3x^2 y + 3x y^2 + y^3.$$

The same argument carried out for the expansion $(x + y)^n$ leads
to the binomial theorem of algebra.

Binomial theorem. The expansion of $(x + y)^n$ is given by

$$(x + y)^n = x^n + \binom{n}{n-1}x^{n-1}y + \binom{n}{n-2}x^{n-2}y^2 + \dots + \binom{n}{1}x y^{n-1} + y^n.$$

Example 1. Let us find the expansion for $(a - 2b)^3$. To fit this
into the binomial theorem, we think of x as being a and y as being
-2b. Then we have

$$(a - 2b)^3 = a^3 + 3a^2(-2b) + 3a(-2b)^2 + (-2b)^3 = a^3 - 6a^2 b + 12ab^2 - 8b^3.$$
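As a check on Example 1, the coefficients can be generated directly from the binomial theorem. This is a minimal sketch with a hypothetical helper `binomial_terms`; it treats the binomial as (p·a + q·b)^n with numerical p and q, which is an assumption made for the illustration and not a device used in the text.

```python
from math import comb

def binomial_terms(p, q, n):
    """Coefficients of a^(n-k) b^k in (p*a + q*b)^n, for k = 0, 1, ..., n."""
    return [comb(n, k) * p ** (n - k) * q ** k for k in range(n + 1)]

coeffs = binomial_terms(1, -2, 3)   # (a - 2b)^3
print(coeffs)

# Spot-check the expansion at one numerical point, a = 5 and b = 2.
a, b = 5, 2
direct = (a - 2 * b) ** 3
from_terms = sum(c * a ** (3 - k) * b ** k for k, c in enumerate(coeffs))
print(direct, from_terms)
```

Agreement at a single point does not prove the identity, but a mismatch would expose an arithmetic slip in the coefficients.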


We turn now to the problem of expanding the trinomial
$(x + y + z)^3$. Again we write

$$(x + y + z)^3 = (x + y + z)(x + y + z)(x + y + z).$$

This time we choose either an x or y or z from each of the three factors.
Our choice is now represented by a three-cell partition of the set of
numbers {1, 2, 3}. The first cell has the numbers corresponding to
factors from which we choose an x, the second cell the numbers corresponding
to factors from which we choose a y, and the third those
from which we choose a z. For example, the partition [{1, 3}, $\mathcal{E}$, {2}]
corresponds to a choice of x from the first and third factors, no y's,
and a z from the second factor. The term obtained is $xzx = x^2 z$. The
coefficient of the term $x^2 z$ in the expansion is thus the number of
three-cell partitions with two elements in the first cell, none in the
second, and one in the third. There are $\binom{3}{2,0,1} = 3$ such partitions.
In general the coefficient of the term of the form $x^a y^b z^c$ in the expansion
will be

$$\binom{3}{a,b,c} = \frac{3!}{a!\,b!\,c!}.$$

Finding in this way the coefficient for each possible a, b, and c we obtain

$$(x + y + z)^3 = x^3 + y^3 + z^3 + 3x^2 y + 3x y^2 + 3y z^2 + 3y^2 z + 3x z^2 + 3x^2 z + 6xyz.$$


The same method can be carried out in general for finding the expansion
of $(x_1 + x_2 + \dots + x_r)^n$. From each factor we choose either
an $x_1$, or $x_2$, . . . , or $x_r$, form the product, and add these products
for all possible choices. We will have $r^n$ products, but many will
be equal. A particular choice of one term from each factor determines
an r-cell partition of the numbers from 1 to n. In the first cell we put
the numbers of the factors from which we choose an $x_1$, in the second
cell those from which we choose $x_2$, etc. A particular choice gives
us a term of the form $x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$ with $n_1 + n_2 + \dots + n_r = n$.
The corresponding partition has $n_1$ elements in the first cell, $n_2$ in the
second, etc. For each such partition we obtain one such term. Hence
the number of these terms which we obtain is the number of such
partitions, which is

$$\binom{n}{n_1, n_2, \dots, n_r} = \frac{n!}{n_1!\,n_2! \cdots n_r!}.$$

Thus we have the multinomial theorem.


Multinomial theorem. The expansion of $(x_1 + x_2 + \dots + x_r)^n$
is found by adding all terms of the form

$$\binom{n}{n_1, n_2, \dots, n_r} x_1^{n_1} x_2^{n_2} \cdots x_r^{n_r}$$

where $n_1 + n_2 + \dots + n_r = n$.
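The counting argument behind the multinomial theorem can be imitated literally: pick one variable from each of the n factors in every possible way and tally the exponent patterns that result. The sketch below does this for small cases; the names `expand_by_choices` and `multinomial` are ours, and variables are represented by the indices 0, ..., r - 1 purely for convenience.

```python
from collections import Counter
from itertools import product
from math import factorial

def expand_by_choices(r, n):
    """Expand (x_1 + ... + x_r)^n by choosing one variable from each factor.

    Returns a Counter mapping each exponent tuple (n_1, ..., n_r) to the
    number of choices that produce the corresponding term.
    """
    counts = Counter()
    for choice in product(range(r), repeat=n):
        exponents = tuple(choice.count(i) for i in range(r))
        counts[exponents] += 1
    return counts

def multinomial(exponents):
    """n! / (n_1! n_2! ... n_r!) with n the sum of the exponents."""
    value = factorial(sum(exponents))
    for e in exponents:
        value //= factorial(e)
    return value

trinomial_cube = expand_by_choices(3, 3)   # (x + y + z)^3
formula_agrees = all(multinomial(e) == c for e, c in trinomial_cube.items())
print(trinomial_cube[(2, 0, 1)], trinomial_cube[(1, 1, 1)], len(trinomial_cube), formula_agrees)
```

The tally visits all $r^n$ products, which is exactly the set of products the text describes before equal ones are collected.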


EXERCISES 

1. Expand by the binomial theorem 

(a) (x + yY. 

(b) (1 + x)K 

(c) (x - y)\ 

(d) (2x + aY. 

(e) $(2x - 3y)^3$.

(f) $(100 - 1)^5$.

2. Expand

(a) $(x + y + z)^4$.

(b) (2 x + y - z)\ 

(c) $(2 + 2 + 1)^3$. (Evaluate two ways.)

3. (a) Find the coefficient of the term $x^2 y^3 z^2$ in the expansion of
$(x + y + z)^7$. [Ans. 210.]

(b) Find the coefficient of the term $x^6 y^3 z^2$ in the expansion of
$(x - 2y + 5z)^{11}$. [Ans. -924,000.]

4. Using the binomial theorem, prove that 



5. Using an argument similar to the one in Section 6, prove that

$$\binom{n+1}{i,j,k} = \binom{n}{i-1,j,k} + \binom{n}{i,j-1,k} + \binom{n}{i,j,k-1}.$$

6. Let f(n,r) be the number of terms in the multinomial expansion of

$$(x_1 + x_2 + \dots + x_r)^n,$$

and show that

$$f(n,r) = \binom{n+r-1}{n}.$$

(Hint: Show that the conditions of Section 6, Exercise 10 are satisfied by
showing that f(n,r-1) is the number of terms which do not have $x_r$ and
f(n-1,r) is the number which do. Alternately, use Exercise 15 of Section 6
by showing that each term in the expansion determines an ordered sequence
of r integers whose sum is n.)

7. How many terms are there in each of the expansions:

(a) $(x + y + z)^6$? [Ans. 28.]

(b) $(a + 2b + 5c + d)^4$? [Ans. 35.]

(c) $(r + s + t + u + v)^6$? [Ans. 210.]

8. Prove that $k^n$ is the sum of the numbers $\binom{n}{r_1, r_2, \dots, r_k}$ for all choices
of $r_1, r_2, \dots, r_k$ such that $r_1 + r_2 + \dots + r_k = n$.

*8. VOTING POWER 

We return to the problem raised in Section 6 of Chapter II. Now
we are interested not only in coalitions, but also in the power of individual
members. We will develop a numerical measure of voting
power that was suggested by L. S. Shapley and M. Shubik. While
the measure will be explained in detail below, for the reasons for
choosing this particular measure the reader is referred to the original
paper.

First of all we must realize that the number of votes a man controls 
is not in itself a good measure of his power. If x has three votes and 
y has one vote, it does not necessarily follow that x has three times 
the power that y has. Thus if the committee has just three members 
{x, y, z} and z also has only one vote, then x is a dictator and y is 
powerless. 

The basic idea of the power index is found in considering various
alignments of the committee members on a number of issues. The n
members are ordered $x_1, x_2, \dots, x_n$ according to how likely they are
to vote for the measure. If the measure is to carry, we must persuade
$x_1$ and $x_2$ up to $x_i$ to vote for it until we have a winning coalition.
If $\{x_1, x_2, \dots, x_i\}$ is a winning coalition but $\{x_1, x_2, \dots, x_{i-1}\}$ is not
winning, then $x_i$ is the crucial member of the coalition. We must
persuade him to vote for the measure, and he is the one hardest to
persuade of the i necessary members. We call $x_i$ the pivot.

For a purely mathematical measure of the power of a member we
do not consider the views of the members. Rather we consider all
possible ways that the members could be aligned on an issue, and see
how often a given member would be the pivot. That means considering
all permutations, and there will be n! of them. In each permutation
one member will be the pivot. The frequency with which a man
is the pivot of an alignment is a good measure of his voting power.

Definition. The voting power of a member of a committee is the 
number of alignments in which he is pivotal divided by the total 
number of alignments. (The total number of alignments, of course, 
is n!, for a committee of n members.) 

Example 1. If all n members have one vote each, and it takes a
majority vote to carry a measure, it is easy to see (by symmetry) that
each member is pivot in 1/n of the alignments. Hence each member
has power = 1/n. Let us illustrate this for n = 3. There are 3! = 6
alignments. It takes two votes to carry a measure; hence the second
member is always the pivot. The alignments are: 123, 132, 213, 231,
312, 321. The pivot in each is the member in second place. Each member
is pivot twice, hence has power 2/6 = 1/3.


Example 2. Reconsider Chapter II, Section 6, Example 3 from
this point of view. There are 24 permutations of the four members.
We will list them, with the pivot in boldface:

    wxyz    wxzy    wyxz    wyzx    wzxy    wzyx
    xwyz    xwzy    xywz    xyzw    xzwy    xzyw
    yxwz    yxzw    ywxz    ywzx    yzxw    yzwx
    zxyw    zxwy    zyxw    zywx    zwxy    zwyx


We see that z has power of 14/24, w has 6/24, x and y have 2/24 each. (Or,
simplified, they have 7/12, 1/4, 1/12, 1/12 power, respectively.) We note
that these ratios are much further apart than the ratio of votes which
is 3:2:1:1. Here three votes are worth seven times as much as the
single vote and more than twice as much as two votes.


Example 3. Reconsider Chapter II, Section 6, Example 4. By an
analysis similar to the ones used so far (but too long to be included
here) it can be shown that in the Security Council of the United
Nations each of the Big Five has 76/385 or approximately 0.197 power,
while each of the small nations has approximately 0.002 power. This
reproduces our intuitive feeling that, while the small nations in the
Security Council are not powerless, nearly all the power is in the hands
of the Big Five.


Example 4. In a committee of five each member has one vote, but
the chairman has veto power. Hence the minimal winning coalitions
are three-member coalitions including the chairman. There are
5! = 120 permutations. The pivot cannot come before the chairman,
since without the chairman we do not have a winning coalition.
Hence, when the chairman is in place number 3, 4, or 5, he is the
pivot. This happens in 3/5 of the permutations. When he is in position
1 or 2, then the number 3 man is pivot. The number of permutations
in which the chairman is in one of the first two positions and a given
man is third is 2 · 3! = 12. Hence the chairman has power 3/5, and each
of the others has power 1/10.
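The definition of voting power translates directly into a brute-force computation: run through all n! alignments, add members one at a time, and record who first turns the coalition into a winning one. The sketch below is a minimal illustration, not an efficient algorithm. The function name `voting_power`, the member labels, and the representation of the voting rule as a predicate `is_winning` are all our own assumptions. The last committee is a hypothetical one with 3, 2, 1, 1 votes in which we assume five votes are needed to carry a measure; that quota is not stated in this section.

```python
from fractions import Fraction
from itertools import permutations

def voting_power(members, is_winning):
    """Fraction of alignments in which each member is the pivot."""
    pivots = {m: 0 for m in members}
    total = 0
    for alignment in permutations(members):
        total += 1
        coalition = set()
        for member in alignment:
            coalition.add(member)
            if is_winning(coalition):
                # This member turned a losing coalition into a winning one.
                pivots[member] += 1
                break
    return {m: Fraction(count, total) for m, count in pivots.items()}

# Three members, one vote each, majority rule.
majority = voting_power([1, 2, 3], lambda c: len(c) >= 2)

# Committee of five in which the chairman has a veto.
committee = ["chair", "a", "b", "c", "d"]
veto = voting_power(committee, lambda c: "chair" in c and len(c) >= 3)

# Hypothetical weighted committee; the quota of 5 is an assumption.
votes = {"z": 3, "w": 2, "x": 1, "y": 1}
weighted = voting_power(list(votes), lambda c: sum(votes[m] for m in c) >= 5)

print(majority, veto, weighted)
```

Because the work grows like n!, this approach is reasonable for committees of up to nine or ten members; a body the size of the Security Council calls for the counting argument instead.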
EXERCISES

1. A committee of three makes decisions by majority vote. Write out all
permutations, and calculate the voting powers if the three members have:

(a) One vote each. [Ans. 1/3, 1/3, 1/3.]

(b) One vote for two of them, two votes for the third. [Ans. 1/6, 1/6, 2/3.]

(c) One vote for two of them, three votes for the third. [Ans. 0, 0, 1.]

(d) One, two, and three votes, respectively. [Ans. 1/6, 1/6, 2/3.]

(e) Two votes each for two of them, and three votes for the third.

[Ans. 1/3, 1/3, 1/3.]

2. Prove that in any decision-making body the sum of the powers of the 
members is 1. 

3. What is the power of a dictator? What is the power of a “powerless” 
member? Prove that your answers are correct. 

4. A large company issued 100,000 shares. These are held by three stockholders,
who have 50,000, 49,999, and 1 share, respectively. Calculate the
powers of the three members. [Ans. 2/3, 1/6, 1/6.]

5. A committee consists of 100 members having one vote each, plus a 
chairman who can break ties. Calculate the power distribution. (Do not try 
to write out all permutations!) 

6. In Exercise 5, give the chairman a veto instead of the power to break 
ties. How does this change the power distribution? 

[Ans. The chairman has power iW.] 

7. How are the powers in Exercise 1 changed if the committee requires 
a f vote to carry a measure? 

8. If in a committee of five, requiring majority decisions, each member
has one vote, then each has power 1/5. Now let us suppose that two members
team up, and always vote the same way. Does this increase their power?
(The best way to represent this situation is by allowing only those permutations
in which these two members are next to each other.)

[Ans. Yes, the pair’s power increases from .4 to .5.]

9. Given the votes that each member of a decision-making body controls, 
show that the minimal winning coalitions can be determined. If the minimal 
winning coalitions are known, show that the power of each member can be 
determined without knowing anything about the number of votes that each 
member controls. 

10. Answer the following questions for a three-man committee:

(a) Find all possible sets of minimal winning coalitions.

(b) For each set of minimal winning coalitions find the distribution of
voting power.

(c) Verify that the various distributions of power found in Exercises 1 
and 7 are the only ones possible. 

11. In Exercise 1, parts (a) and (e) have the same answer, and parts (b) 
and (d) and Exercise 4 also have the same answer. Use the results of Exercise 
9 to find a reason for these coincidences. 


12. Compute the voting power of one of the Big Five in the Security
Council of the United Nations as follows:

(a) Show that for the nation to be pivotal it must be in the number 7
spot or later.

(b) Show that there are $\binom{6}{2}\,6!\,4!$ permutations in which the nation is
pivotal in the number 7 spot.

(c) Find similar formulas for the number of permutations in which
it is pivotal in the number 8, 9, 10, or 11 spot.

(d) Use this information to find the total number of permutations in
which it is pivotal, and from this compute the power of the nation.
Chapter IV


PROBABILITY THEORY 


1. INTRODUCTION 

We often hear statements of the following kind, “It is likely to rain
today,” “I have a fair chance of passing this course,” “There is an
even chance that a coin will come up heads,” etc. In each case our
statement refers to a situation in which we are not certain of the
outcome, but we express some degree of confidence that our prediction
will be verified. The theory of probability provides a mathematical
framework for such assertions.

Consider an experiment whose outcome is not known. Suppose that
someone makes an assertion p about the outcome of the experiment,
and we want to assign a probability to p. When statement p is considered
in isolation, we usually find no natural assignment of probabilities.
Rather, we look for a method of assigning probabilities to
all conceivable statements concerning the outcome of the experiment.
At first this might seem to be a hopeless task, since there is no end
to the statements we can make about the experiment. However we
are aided by a basic principle:

Fundamental assumption. Any two equivalent statements will be 
assigned the same probability. 

As long as there are a finite number of logical possibilities, there are
only a finite number of truth sets, and hence the process of assigning
probabilities is a finite one. We proceed in three steps: (1) we first
determine $\mathcal{U}$, the possibility set, that is, the set of all logical possibilities,
(2) to each subset X of $\mathcal{U}$ we assign a number called the measure
m(X), (3) to each statement p we assign m(P), the measure of its
truth set, as a probability. The probability of statement p is denoted
by Pr[p].

The first step, that of determining the set of logical possibilities,
is one that we considered in the previous chapters. It is important
to recall that there is no unique method for analyzing logical possibilities.
In a given problem we may arrive at a very fine or a very
rough analysis of possibilities, causing $\mathcal{U}$ to have many or few elements.

Having chosen $\mathcal{U}$, the next step is to assign a number to each subset
X of $\mathcal{U}$, which will in turn be taken to be the probability of any statement
having truth set X. We do this in the following way.

Assignment of a measure. Assign a positive number (weight) to
each element of $\mathcal{U}$ so that the sum of the weights assigned is 1. Then
the measure of a set is the sum of the weights of its elements. The
measure of the set $\mathcal{E}$ is 0.

In applications of probability to scientific problems, the assignment 
of measures and the analysis of the logical possibilities may depend 
upon factual information and hence can best be done by the scientist 
making the application. 

Once the weights are assigned, to find the probability of a particular 
statement we must find its truth set and find the sum of the weights 
assigned to elements of the truth set. This problem, which might 
seem easy, can often involve considerable mathematical difficulty. 
The development of techniques to solve this kind of problem is the 
main task of probability theory. 

Example 1. An ordinary die is thrown. What is the probability
that the number which turns up is less than 4? Here the possibility
set is $\mathcal{U}$ = {1, 2, 3, 4, 5, 6}. The symmetry of the die suggests that
each face should have the same probability of turning up. To make
this so we assign weight 1/6 to each of the outcomes. The truth set of
the statement, “The number which turns up is less than 4,” is
{1, 2, 3}. Hence the probability of this statement is 3/6 = 1/2, the sum
of the weights of the elements in its truth set.

Example 2. A man attends a race involving three horses A, B,
and C. He feels that A and B have the same chance of winning but
that A (and hence also B) is twice as likely to win as C is. What is
the probability that A or C wins? We take as $\mathcal{U}$ the set {A, B, C}.
If we were to assign weight a to the outcome C, then we would assign
weight 2a to each of the outcomes A and B. Since the sum of the
weights must be 1, we have 2a + 2a + a = 1, or a = 1/5. Hence we
assign weights 2/5, 2/5, 1/5 to the outcomes A, B, and C, respectively.
The truth set of the statement “Horse A or C wins” is {A, C}. The
sum of the weights of the elements of this set is 2/5 + 1/5 = 3/5. Hence
the probability that A or C wins is 3/5.
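The three-step procedure (possibility set, weights, measure of the truth set) can be written out in a few lines. The sketch below is one possible rendering, not a prescribed one: the possibility set is a dictionary from outcomes to weights, a statement is a Python function that returns True or False for each outcome, and the names `measure` and `probability` are our own.

```python
from fractions import Fraction

def measure(weights, subset):
    """Sum of the weights of the elements of subset (0 for the empty set)."""
    return sum((weights[e] for e in subset), Fraction(0))

def probability(weights, statement):
    """Probability of a statement: the measure of its truth set."""
    truth_set = {e for e in weights if statement(e)}
    return measure(weights, truth_set)

# Example 1: a symmetric die, weight 1/6 on each face.
die = {face: Fraction(1, 6) for face in range(1, 7)}
less_than_4 = probability(die, lambda face: face < 4)

# Example 2: A and B are each twice as likely to win as C.
a = Fraction(1, 5)
horses = {"A": 2 * a, "B": 2 * a, "C": a}
a_or_c = probability(horses, lambda h: h in ("A", "C"))

print(less_than_4, a_or_c)
```

Exact fractions are used so that the requirement that the weights sum to 1 can be tested with equality rather than up to rounding error.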

EXERCISES 

1. Assume that there are n possibilities for the outcome of a given experiment.
How should the weights be assigned if it is desired that all outcomes
be assigned the same weight?

2. Let $\mathcal{U}$ = {a, b, c}. Assign weights to the three elements so that no
two have the same weight, and find the measures of the eight subsets of $\mathcal{U}$.

3. In an election Jones has probability \ of winning, Smith has proba¬ 
bility and Black has probability f. 

(a) Construct $\mathcal{U}$.

(b) Assign weights.

(c) Find the measures of the eight subsets.

(d) Give a pair of nonequivalent predictions which have the same
probability.

4. Give the possibility set $\mathcal{U}$ for each of the following experiments.

(a) An election between candidates A and B is to take place. 

(b) A number between 1 and 5 is chosen at random. 

(c) A two-headed coin is thrown. 

(d) A student is asked for the day of the year on which his birthday 
falls. 

5. For which of the cases in Exercise 4 might it be appropriate to assign 
the same weight to each outcome? 

6. Suppose that the following probabilities have been assigned to the 
possible results of putting a penny in a certain defective peanut-vending 
machine: The probability that nothing comes out is The probability that 
either you get your money back or you get peanuts (but not both) is 

(a) What is the probability that you get your money back and also 

get peanuts? [Ans. i] 

(b) From the information given, is it possible to find the probability 

that you get peanuts? [Ans. No.] 
7. A die is loaded in such a way that the probability of each face is proportional
to the number of dots on that face. (For instance, a 6 is three times
as probable as a 2.) What is the probability of getting an even number in
one throw? [Ans. 4/7.]

8. If a coin is thrown three times, list the eight possibilities for the outcomes
of the three successive throws. A typical outcome can be written
(HTH). Determine a probability measure by assigning an equal weight to
each outcome. Find the probabilities of the following statements:

(r) The number of heads that occur is greater than the number of
tails. [Ans. 1/2.]

(s) Exactly two heads occur. [Ans. 3/8.]

(t) The same side turns up on every throw. [Ans. 1/4.]

9. For the statements given in Exercise 8, which of the following equalities
are true?

(a) Pr[r ∨ s] = Pr[r] + Pr[s]

(b) Pr[s ∨ t] = Pr[s] + Pr[t]

(c) Pr[r ∨ ~r] = Pr[r] + Pr[~r]

(d) Pr[r ∨ t] = Pr[r] + Pr[t]

10. Which of the following pairs of statements (see Exercise 8) are inconsistent?
(Recall that two statements are inconsistent if their truth sets have
no element in common.)

(a) r, s    (c) r, ~r

(b) s, t    (d) r, t    [Ans. (b) and (c).]

11. State a theorem suggested by Exercises 9 and 10.

2. PROPERTIES OF A PROBABILITY MEASURE 

Before studying special probability measures, we shall consider
some general properties of such measures which are useful in computations
and in the general understanding of probability theory.
Three basic properties of a probability measure are:

(A) m(X) = 0 if and only if X = $\mathcal{E}$.

(B) 0 ≤ m(X) ≤ 1 for any set X.

(C) For two sets X and Y,

m(X ∪ Y) = m(X) + m(Y)

if and only if X and Y are disjoint, i.e., have no elements
in common.

The proofs of properties (A) and (B) are left as an exercise (see
Exercise 16). We shall prove (C).

We observe first that m(X) + m(Y) is the sum of the weights of
the elements of X added to the sum of the weights of Y. If X and Y
are disjoint, then the weight of every element of X ∪ Y is added
once and only once, and hence m(X) + m(Y) = m(X ∪ Y).

Assume now that X and Y are not disjoint. Here the weight of
every element contained in both X and Y, i.e., in X ∩ Y, is added
twice in the sum m(X) + m(Y). Thus this sum is greater than
m(X ∪ Y) by an amount m(X ∩ Y). By (A) and (B), if X ∩ Y is
not the empty set, then m(X ∩ Y) > 0. Hence in this case we have
m(X) + m(Y) > m(X ∪ Y). Thus if X and Y are not disjoint, the
equality in (C) does not hold. Our proof shows that in general we
have

(C') For any two sets X and Y,

m(X ∪ Y) = m(X) + m(Y) - m(X ∩ Y).

Since the probabilities for statements are obtained directly from
the probability measure m(X), any property of m(X) can be translated
into a property about the probability of statements. For example,
the above properties become, when expressed in terms of
statements:

(a) Pr[p] = 0 if and only if p is logically false.

(b) 0 ≤ Pr[p] ≤ 1 for any statement p.

(c) The equality

Pr[p ∨ q] = Pr[p] + Pr[q]

holds if and only if p and q are inconsistent.

(c') For any two statements p and q,

Pr[p ∨ q] = Pr[p] + Pr[q] - Pr[p ∧ q].

Another property of a probability measure which is often useful
in computation is

(D) $m(\tilde{X}) = 1 - m(X)$, where $\tilde{X}$ is the complement of X,

or, in the language of statements,

(d) Pr[~p] = 1 - Pr[p].

The proofs of (D) and (d) are left as an exercise (see Exercise 17). 

It is important to observe that our probability measure assigns
probability 0 only to statements which are logically false, i.e., which
are false for every logical possibility. Hence, a prediction that such
a statement will be true is certain to be wrong. Similarly a statement
is assigned probability 1 only if it is true in every case, i.e., logically
true. Thus the prediction that a statement of this type will be true
is certain to be correct. (While these properties of a probability measure
seem quite natural, it is necessary, when dealing with infinite
possibility sets, to weaken them slightly. We consider in this book
only finite possibility sets.)

We shall now discuss the interpretation of probabilities that are
not 0 or 1. We shall give only some intuitive ideas that are commonly
held concerning probabilities. While these ideas can be made mathematically
more precise, we offer them here only as a guide to intuitive
thinking.

Suppose that, relative to a given experiment, a statement has been
assigned probability p. From this it is often inferred that if a sequence
of such experiments is performed under identical conditions, the fraction
of experiments which yield outcomes making the statement true
would be approximately p. The mathematical version of this is the
“law of large numbers” of probability theory (which will be treated
in Section 10). In cases where there is no natural way to assign a
probability measure, the probability of a statement is estimated
experimentally. A sequence of experiments is performed and the fraction
of the experiments which make the statement true is taken as
the approximate probability for the statement.
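
The frequency interpretation can be tried out with a small simulation. The sketch below is only an illustration under assumed conditions: the experiment (throwing a fair die and asking whether a six turns up), the function name `relative_frequency`, and the fixed seed are all choices made here, not part of the text.

```python
import random

def relative_frequency(trials, seed=0):
    """Fraction of simulated throws of a fair die that show a six."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
    return hits / trials

for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

As the number of throws grows, the printed fractions should settle near 1/6, the probability assigned to the statement “a six turns up.”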

A second and related interpretation of probabilities is concerned 
with betting. Suppose that a certain statement has been assigned 
probability p. We wish to offer a bet that the statement will in fact 
turn out to be true. We agree to give r dollars if the statement does 
not turn out to be true, provided that we receive s dollars if it does 
turn out to be true. What should r and s be to make the bet fair? 
If it were true that in a large number of such bets we would win s
a fraction p of the time and lose r a fraction 1 − p of the time, then
our average winning per bet would be sp − r(1 − p). To make the
bet fair we should make this average winning 0. This will be the case
if sp = r(1 − p), or if r/s = p/(1 − p). Notice that this determines
only the ratio of r and s. Such a ratio, written r:s, is said to give odds
for the bet.
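
The relation r/s = p/(1 − p) is easy to compute with exact fractions. The following is a minimal sketch; the function names `fair_odds` and `average_winning` and the sample probability 2/3 are assumptions introduced for illustration.

```python
from fractions import Fraction

def fair_odds(p):
    """Return (r, s) in lowest terms with r/s = p/(1 - p), for 0 < p < 1."""
    p = Fraction(p)
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    ratio = p / (1 - p)
    return ratio.numerator, ratio.denominator

def average_winning(p, r, s):
    """Average gain per bet: win s with probability p, lose r otherwise."""
    p = Fraction(p)
    return s * p - r * (1 - p)

r, s = fair_odds(Fraction(2, 3))
print(f"{r}:{s}", average_winning(Fraction(2, 3), r, s))
```

For a fair bet the average winning is 0, whichever multiple of the pair (r, s) is used as the stakes.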
Example. Assume that a probability of 3/4 has been assigned to a
certain horse winning a race. Then the odds for a fair bet would be
3/4 : 1/4. These odds could be equally well written as 3:1, 6:2 or 12:4, etc.
A fair bet would be to agree to pay $3 if the horse loses and receive 
$1 if the horse wins. Another fair bet would be to pay $6 if the horse 
loses and win $2 if the horse wins. 

EXERCISES 

1. Let p and q be statements such that Pr[p A q] = h Pr[~p] = and 

Pr[g] = What is Pr[p V <?]? [Ans. hi-] 

2. Using the result of Exercise 1, find Pr[~p ∧ ~q].

3. Let p and q be statements such that Pr[p] = J and Pr [q] = f. Are 

p and q consistent? [Ans. Yes.] 

4. Show that, if Pr[p] + Pr[q] > 1, then p and q are consistent.

5. A student is worried about his grades in English and Art. He estimates 
that the probability of passing English is .4, that he will pass at least one 
course with probability .6, but that he has only probability .1 of passing 
both courses. What is the probability that he will pass Art? [Ans. .3.] 

6. Given that a school has grades A, B, C, D, and F, and that a student
has probability .9 of passing a course, and .6 of getting a grade lower than B,
what is the probability that he will get a C or D? [Ans. 1/2.]

7. What odds should a person give on a bet that a six will turn up when 
a die is thrown? 

8. Referring to Example 2 of Section 1, what odds should the man be 
willing to give for a bet that either A or B will come in first? 

9. Prove that if the odds relative to a given statement are r:s, then the 
probability that the statement will be true is r/(r + s). 

10. Using the result of Exercise 9 and the definition of “odds,” show that 
if the odds are r:s that a statement is true, then the odds are s:r that it is false. 

11. A man is willing to give 5:4 odds that the Dodgers will win the World
Series. What must the probability of a Dodger victory be for this to be a fair
bet? [Ans. 5/9.]

12. A man has found through long experience that if he washes his car 
it rains the next day 85 per cent of the time. What odds should he give that 
this will occur next time? 
13. A man offers 1:3 odds that A will occur, 1:2 odds that B will occur.
He knows that A and B cannot both occur. What odds should he give that
A or B will occur? [Ans. 7:5.]

14. A man offers 3:1 odds that A will occur, 2:1 odds that B will occur. 
He knows that A and B cannot both occur. What odds should he give that 
A or B will occur? 

15. Show from the definition of a probability measure that m(X) = 1 if
and only if X = 𝒰.

16. Show from the definition of a probability measure that properties
(A), (B) of the text are true.

17. Prove property (D) of the text. Why does property (d) follow from
this property?

18. Prove that if R, S, and T are three sets that have no element in
common,

m(R ∪ S ∪ T) = m(R) + m(S) + m(T).

19. If X and Y are two sets such that X is a subset of Y, prove that
m(X) ≤ m(Y).

20. If p and q are two statements such that p implies q, prove that
Pr[p] ≤ Pr[q].

21. Suppose that you are given n statements and each has been assigned a
probability less than or equal to r. Prove that the probability of the
disjunction of these statements is less than or equal to nr.

22. The following is an alternative proof of property (C') of the text.
Give a reason for each step.

(a) X ∪ Y = (X ∩ Ỹ) ∪ (X ∩ Y) ∪ (X̃ ∩ Y).

(b) m(X ∪ Y) = m(X ∩ Ỹ) + m(X ∩ Y) + m(X̃ ∩ Y).

(c) m(X ∪ Y) = m(X) + m(Y) − m(X ∩ Y).

23. If X, Y, and Z are any three sets, prove that, for any probability
measure,

m(X ∪ Y ∪ Z) = m(X) + m(Y) + m(Z) − m(X ∩ Y)
               − m(Y ∩ Z) − m(X ∩ Z) + m(X ∩ Y ∩ Z).

24. Translate the result of Exercise 23 into a result concerning three
statements p, q, and r.

25. A man offers to bet “dollars to doughnuts” that a certain event will
take place. Assuming that a doughnut costs a nickel, what must the
probability of the event be for this to be a fair bet? [Ans. 20/21.]
3. THE EQUIPROBABLE MEASURE

We have already seen several examples where it was natural to
assign the same weight to all possibilities in determining the appropriate
probability measure. The probability measure determined in
this manner is called the equiprobable measure. The measure of sets
in the case of the equiprobable measure has a very simple form. In
fact, if 𝒰 has n elements and if the equiprobable measure has been
assigned, then for any set X, m(X) is r/n, where r is the number of
elements in the set X. This is true since the weight of each element
in X is 1/n, and hence the sum of the weights of elements of X is r/n.

The particularly simple form of the equiprobable measure makes
it easy to work with. In view of this it is important to observe that a
particular choice for the set of possibilities in a given situation may
lead to the equiprobable measure, while some other choice will not.
For example, consider the case of two throws of an ordinary coin.
Suppose that we are interested in statements about the number
of heads which occur. If we take for the possibility set the set
𝒰 = {HH, HT, TH, TT} then it is reasonable to assign the same
weight to each outcome, and we are led to the equiprobable measure.
If, on the other hand, we were to take as possible outcomes the set
𝒰 = {no H, one H, two H}, it would not be natural to assign the same
weight to each outcome, since one head can occur in two different
ways, while each of the other possibilities can occur in only one way.

Example 1. Suppose that we throw two ordinary dice. Each die
can turn up a number from 1 to 6; hence there are 6·6 possibilities.
We assign weight 1/36 to each possibility. A prediction that is true in
j cases will then have probability j/36. For example, “The sum of
the dice is 5,” will be true if we get 1 + 4, 2 + 3, 3 + 2, or 4 + 1.
Hence the probability that the sum of the dice is 5 is 4/36 = 1/9. The
sum can be 12 in only one way, 6 + 6. Hence the probability that
the sum is 12 is 1/36.
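
The counting in the two-dice example can be carried out mechanically. The sketch below enumerates the 36 outcomes and counts the cases in which a statement is true; the names `outcomes` and `probability` are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

# The 36 equally likely outcomes of throwing two dice.
outcomes = list(product(range(1, 7), repeat=2))

def probability(statement):
    """Equiprobable measure: (number of cases making the statement true) / 36."""
    favorable = sum(1 for outcome in outcomes if statement(outcome))
    return Fraction(favorable, len(outcomes))

print(probability(lambda o: sum(o) == 5))    # "The sum of the dice is 5"
print(probability(lambda o: sum(o) == 12))   # "The sum of the dice is 12"
```

Any other statement about the two dice can be handled the same way by passing a different test function.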

Example 2. Suppose that two cards are drawn successively from
a deck of cards. What is the probability that both are hearts? There
are 52 possibilities for the first card, and for each of these there are
51 possibilities for the second. Hence there are 52·51 possibilities for
the result of the two draws. We assign the equiprobable measure.
The statement “The two cards are hearts” is true in 13·12 of the
52·51 possibilities. Hence the probability of this statement is
(13·12)/(52·51) = 1/17.
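
The same count can be confirmed by listing every ordered pair of distinct cards. This is a brute-force sketch; representing a card as a (rank, suit) pair and the variable names are assumptions of the illustration.

```python
from fractions import Fraction
from itertools import permutations

suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in suits for rank in range(1, 14)]

# Ordered pairs of distinct cards: 52 * 51 possibilities.
draws = list(permutations(deck, 2))
both_hearts = sum(1 for first, second in draws
                  if first[1] == "hearts" and second[1] == "hearts")

print(len(draws), both_hearts, Fraction(both_hearts, len(draws)))
```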


Example 3. Assume that, on the basis of a predictive index applied
to students A, B, and C when entering college, it is predicted
that after four years of college the scholastic record of A will be the
highest, C the second highest, and B the lowest of the three. Suppose,
in fact, that these predictions turn out to be exactly correct. If the
predictive index has no merit at all and hence the predictions amount
simply to guessing, what is the probability that such a prediction will
be correct? There are 3! = 6 orders in which the men might finish.
If the predictions were really just guessing, then we would assign an
equal weight to each of the six outcomes. In this case the probability
that a particular prediction is true is 1/6. Since this probability is
reasonably large, we would hesitate to conclude that the predictive
index is in fact useful, on the basis of this one experiment. Suppose,
on the other hand, it predicted the order of six men correctly. Then
a similar analysis would show that, by guessing, the probability is
1/6! = 1/720 that such a prediction would be correct. Hence, we might
conclude here that there is strong evidence that the index has some
merit.

EXERCISES


1. A letter is chosen at random from the word “random.” What is the 

probability that it is an n ? That it is a vowel? [Ans. J.] 

2. An integer between 3 and 12 inclusive is chosen at random. What is 
the probability that it is an even number? That it is even and divisible by 
three? 

3. A card is drawn at random from a pack of playing cards.

(a) What is the probability that it is either a heart or the king of
clubs? [Ans. 7/26.]

(b) What is the probability that it is either the queen of hearts or an
honor card (i.e., ten, jack, queen, king, or ace)? [Ans. 5/13.]

4. A word is chosen at random from the set of words 𝒰 = {men, bird,
ball, field, book}. Let p, q, and r be the statements:

p: The word has two vowels.

q: The first letter of the word is b.

r: The word rhymes with cook.

Find the probability of the following statements:

(a) p. 

(b) q. 

(c) r. 

(d) p ∧ q.

(e) (p V q) A ~r. 

(f) V [Arcs. t] 

5. A single die is thrown. Find the probability that 

(a) An odd number turns up. 

(b) The number which turns up is greater than two. 

(c) A seven turns up. 

6. In the primary voting example of Chapter II, Section 1, assume that 
all 36 possibilities in the elections are equally likely. Find 

(a) The probability that candidate A wins more states than either of 

his rivals. [Arts. T V] 

(b) That all the states are won by the same candidate. [ Ans . 3V.] 

(c) That every state is won by a different candidate. [Ans. 0.] 

7. A single die is thrown twice. What value for the sum of the two
outcomes has the highest probability? What value or values of the sum has the
lowest probability of occurring?

8. Two boys and two girls are placed at random in a row for a picture.
What is the probability that the boys and girls alternate in the picture?
[Ans. 1/3.]

9. A certain college has 500 students and it is known that: 

300 read French. 

200 read German. 

50 read Russian. 

20 read French and Russian. 

30 read German and Russian. 

20 read German and French. 

10 read all three languages. 

If a student is chosen at random from the school, what is the probability that 
the student: 

(a) Reads two and only two languages? 

(b) Reads at least one language? 

10. Suppose that three people enter a restaurant which has a row of six 
seats. If they choose their seats at random, what is the probability that they 
sit with no seats between them? What is the probability that there is at 
least one empty seat between any two of them? 
11. Find the probability of obtaining each of the following poker hands.
(A poker hand is a set of five cards chosen at random from a deck of 52 cards.)

(a) Royal flush (ten, jack, queen, king, ace in a single suit).
[Ans. $4/\binom{52}{5}$ = .0000015.]

(b) Straight flush (five in a sequence in a single suit, but not a royal
flush). [Ans. $(40 - 4)/\binom{52}{5}$ = .000014.]

(c) Four of a kind (four cards of the same face value).
[Ans. $624/\binom{52}{5}$ = .00024.]

(d) Full house (one pair and one triple of the same face value).
[Ans. $3744/\binom{52}{5}$ = .0014.]

(e) Flush (five cards in a single suit but not a straight or royal flush).
[Ans. $(5148 - 40)/\binom{52}{5}$ = .0020.]

(f) Straight (five cards in a row, not all of the same suit).
[Ans. $(10{,}240 - 40)/\binom{52}{5}$ = .0039.]

(g) Straight or better. [Ans. .0076.]

12. If ten people are seated at a circular table at random, what is the
probability that a particular pair of people are seated next to each other?
[Ans. 2/9.]

13. A room contains a group of n people who are wearing badges numbered
from 1 to n. If two people are selected at random, what is the probability
that the larger badge number is a 3? Answer this problem assuming that
n = 5, 4, 3, 2. [Ans. 1/5; 1/3; 2/3; 0.]

14. In Exercise 13, suppose that we observe two men leaving the room and 
that the larger of their badge numbers is 3. What might we guess as to the 
number of people in the room? 


15. Find the probability that a bridge hand will have suits of:

(a) 5, 4, 3, and 1 cards. [Ans. .129.]

(b) 6, 4, 2, and 1 cards. [Ans. .047.]

(c) 4, 4, 3, and 2 cards. [Ans. .216.]

(d) 4, 3, 3, and 3 cards. [Ans. .105.]

16. There are $\binom{52}{13}$ = 6.35 × 10^11 possible bridge hands. Find the
probability that a bridge hand dealt at random will be all of one suit. Estimate
roughly the number of bridge hands dealt in the entire country in a year.
Is it likely that a hand of all one suit will occur sometime during the year in
the United States?


4. TWO NONINTUITIVE EXAMPLES 

There are occasions in probability theory when one finds a problem
for which the answer, based on probability theory, is not at all in
agreement with one’s intuition. It is usually possible to arrange a
few wagers that will bring one’s intuition into line with the mathematical
theory. A particularly good example of this is provided by
the matching birthdays problem.

Assume that we have a room with r people in it and we propose the
bet that there are at least two people in the room having the same
birthday, i.e., the same month and day of the year. We ask for the
value of r which will make this a fair bet. Few people would be willing
to bet even money on this wager unless there were at least 100 people
in the room. Most people would suggest 150 as a reasonable number.
However, we shall see that with 150 people the odds are approximately
4,500,000,000,000,000 to 1 in favor of two people having the
same birthday, and that one should be willing to bet even money
with as few as 23 people in the room.

Let us first find the probability that in a room with r people, no
two have the same birthday. There are 365 possibilities for each
person’s birthday (neglecting February 29). There are then 365^r
possibilities for the birthdays of r people. We assume that all these
possibilities are equally likely. To find the probability that no two
have the same birthday we must find the number of possibilities for
the birthdays which have no day represented twice. The first person
can have any of 365 days for his birthday. For each of these, if the
second person is to have a different birthday, there are only 364
possibilities for his birthday. For the third man, there are 363
possibilities if he is to have a different birthday than the first two, etc. Thus
the probability that no two people have the same birthday in a group
of r people is

\[ q_r = \frac{365 \cdot 364 \cdots (365 - r + 1)}{365^r}. \]

The probability that at least two people have the same birthday
is then p_r = 1 − q_r. In Figure 1 the values of p_r and the odds for a
fair bet, p_r:(1 − p_r), are given for several values of r.

Number of people    Probability of at least    Approximate odds
in the room         two with same birthday     for a fair bet

        5                   .027
       10                   .117
       15                   .253
       20                   .411                         70:100
       21                   .444                         80:100
       22                   .476                         91:100
       23                   .507                        103:100
       24                   .538                        117:100
       25                   .569                        132:100
       30                   .706                        242:100
       40                   .891                        819:100
       50                   .970                           33:1
       60                   .994                          169:1
       70                                               1,200:1
       80                                              12,000:1
       90                                             160,000:1
      100                                           3,300,000:1
      125                                      31,000,000,000:1
      150                               4,500,000,000,000,000:1

Figure 1
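
The probabilities for the birthday problem can be recomputed directly from the product formula for q_r. The following is a minimal sketch; the function names `prob_no_match` and `prob_match` and the `days` parameter are assumptions made for this illustration.

```python
def prob_no_match(r, days=365):
    """q_r: probability that r people all have different birthdays."""
    q = 1.0
    for k in range(r):
        q *= (days - k) / days     # person k + 1 must avoid k earlier birthdays
    return q

def prob_match(r, days=365):
    """p_r = 1 - q_r: probability that at least two birthdays coincide."""
    return 1.0 - prob_no_match(r, days)

for r in (5, 10, 20, 22, 23, 30, 50):
    p = prob_match(r)
    print(f"r = {r:3d}   p_r = {p:.3f}   odds = {p / (1 - p):.2f}:1")
```

The output shows p_r crossing 1/2 between r = 22 and r = 23, which is why an even-money bet becomes favorable with 23 people.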

We consider now a second problem in which intuition does not lead
to the correct answer. We have seen that there are n! permutations
of the numbers from 1 to n. Let us consider a rearrangement of these
numbers as the operation of placing each of the numbers in one of n
boxes or positions (one number to a position). The positions are assumed
to be numbered in serial order. We shall say that the ith number
is unchanged by the permutation if, after the rearrangement,
number i is still in the ith position. For example, if we consider the
permutations of the numbers 1, 2, and 3, then the permutation 123
leaves all numbers fixed, the permutation 213 leaves one number
fixed, and the permutations 312 and 231 leave no numbers fixed. It
is obviously impossible, in this example, to leave exactly two numbers
fixed. (Why?)

Definition. A complete permutation is one that leaves no numbers 
fixed. 

The problem that we now consider can be stated as follows. If a 
permutation of n numbers is chosen at random, what is the probability 
that the permutation chosen is a complete permutation? A more 
colorful but equivalent problem is the following. A hat-check girl has 
checked n hats, but they have become hopelessly scrambled. She 
hands back the hats at random. What is the probability that no man 
gets his own hat? For this problem some people’s intuition would 
lead them to guess that for a large number of hats this probability 
should be small, while others guess that it should be large. Few people
guess that the probability is neither large nor small and essentially 
independent of the number of hats involved. 

To find the desired probability, we assume that all n! permutations
are equally likely, and hence we need only count the number of complete
permutations which there are for n elements. Let w_n be the
number of such permutations. Then the desired probability is
p_n = w_n/n!. If this procedure is carried out (see Exercise 11), the
answer is found to be

\[ p_n = \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \cdots \pm \frac{1}{n!}, \]

where the + sign is chosen if n is even and the − sign if n is odd. In
Figure 2, these numbers are given for the first few values of n.


Number      Probability p_n that
of hats     no man gets his hat

   2             .500000
   3             .333333
   4             .375000
   5             .366667
   6             .368056
   7             .367857
   8             .367882

Figure 2


It can be shown that, as the number of hats increases, the probabilities
approach a number 1/e = .367879 . . . , where the number
e = 2.718281 . . . is a number that plays an important role in many
branches of mathematics.
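
The hat-check probabilities can be checked by brute force for small n. The sketch below counts complete permutations directly and compares the result with the alternating sum quoted above; the helper names `count_complete` and `p_series` are assumptions made for this illustration, and the brute-force count is practical only for small n.

```python
from fractions import Fraction
from itertools import permutations
from math import e, factorial

def count_complete(n):
    """Count permutations of 0..n-1 that leave no number fixed (brute force)."""
    return sum(1 for perm in permutations(range(n))
               if all(perm[i] != i for i in range(n)))

def p_series(n):
    """The alternating sum 1/2! - 1/3! + ... ± 1/n! as an exact fraction."""
    return sum(Fraction((-1) ** k, factorial(k)) for k in range(2, n + 1))

for n in range(2, 9):
    brute = Fraction(count_complete(n), factorial(n))
    print(n, f"{float(brute):.6f}", brute == p_series(n))
print(f"1/e = {1 / e:.6f}")
```

The two computations agree for each n tried, and already for n = 8 the probability is within a few millionths of 1/e.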


EXERCISES 

1. What odds should you be willing to give on a bet that at least two 
people in the United States Senate have the same birthday? 

[Ans. More than 160,000:1.] 

2. What is the probability that in the House of Representatives at least 
two men have the same birthday? 
3. What odds should you be willing to give on a bet that at least two of
the Presidents of the United States have had the same birthday? Would
you win the bet?

[Ans. More than 3:1; Yes. Polk and Harding were born on Nov. 2.]

4. What odds should you be willing to give on the bet that at least two
of the Presidents of the United States have died on the same day of the year?
Would you win the bet?

[Ans. More than 2.4:1; Yes. Jefferson, Adams, and Monroe all died
on July 4.]

5. Four men check their hats. Assuming that the hats are returned at
random, what is the probability that exactly four men get their own hats?
Calculate the answer for exactly 3, 2, 1, 0 men. [Ans. 1/24; 0; 1/4; 1/3; 3/8.]

6. A group of SG men and their wives attend a dance. The partners for a 
dance are chosen by lot. What is the approximate probability that no man
dances with his wife? 

7. Show that the probability that, in a group of r people, exactly one
pair has the same birthday is

\[ t_r = \binom{r}{2} \frac{365 \cdot 364 \cdots (365 - r + 2)}{365^r}. \]

8. Show that

\[ t_r = \binom{r}{2} \frac{q_r}{366 - r}, \]

where t_r is defined in Exercise 7, and q_r
is the probability that no pair has the same birthday.

9. Using the result of Exercise 8 and the results given in Figure 1, find
the probability of exactly one pair of people with the same birthday in a
group of r people, for r = 15, 20, 25, 30, 40, and 50.

[Ans. .22; .32; .38; .38; .26; .12.]

10. What is the approximate probability that there has been exactly one 
pair of Presidents with the same birthday? 

11. Let w_n be the number of complete permutations of n numbers.

(a) Show that

\[ w_1 = 0, \quad w_2 = 1, \]
\[ w_n = (n - 1)w_{n-1} + (n - 1)w_{n-2}, \quad n = 3, 4, \ldots \]

(Hint: Any complete permutation of n numbers can be obtained from
a complete permutation of n − 1 numbers or from a permutation of
n − 1 numbers that leaves one number fixed. Describe how this can
be done, and show that the two terms on the right side of the equation
represent the number that can be obtained from each of these
methods.)

(b) Let p_n be the probability that a permutation of n numbers chosen
at random is a complete permutation. From part (a) show that

\[ p_1 = 0, \quad p_2 = \frac{1}{2}, \]
\[ p_n = \frac{n - 1}{n}\, p_{n-1} + \frac{1}{n}\, p_{n-2} \quad \text{for } n = 3, 4, \ldots \]

(c) Let v_n = p_n − p_{n−1} for n = 2, 3, 4, . . . . From part (b), show that

\[ n(p_n - p_{n-1}) = -(p_{n-1} - p_{n-2}), \quad n = 3, 4, \ldots, \]

and hence that

\[ n v_n = -v_{n-1}, \quad n = 3, 4, \ldots \]

(d) Using the fact that p_1 = 0 and p_2 = 1/2, find v_2. From the result
of part (c) find v_3, v_4, . . . , v_n.

(e) Using the result of part (d), show that

\[ p_n = \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - \cdots \pm \frac{1}{n!}. \]


5. CONDITIONAL PROBABILITY 

Suppose that we have a given 𝒰 and that measures have been assigned
to all subsets of 𝒰. A statement p will have probability
Pr[p] = m(P). Suppose we now receive some additional information,
say that statement q is true. How does this additional information
alter the probability of p?

The probability of p after the receipt of the information q is called
its conditional probability, and it is denoted by Pr[p|q], which is read
“the probability of p given q.” In this section we will construct a
method of finding this conditional probability in terms of the measure m.

If we know that q is true, then the original possibility set 𝒰 has
been reduced to Q and therefore we must define our measure on the
subsets of Q instead of on the subsets of 𝒰. Of course, every nonempty
subset X of Q is a subset of 𝒰, and hence we know m(X), its
measure before q was discovered. Since q cuts down on the number
of possibilities, its new measure m'(X) should be larger.

The basic idea on which the definition of m' is based is that, while
we know that the possibility set has been reduced to Q, we have no
new information about subsets of Q. If X and Y are subsets of Q,
and m(X) = 2·m(Y), then we will want m'(X) = 2·m'(Y). This
will be the case if the measures of subsets of Q are simply increased
by a proportionality factor, m'(X) = k·m(X), and all that remains is
to determine k. Since we know that 1 = m'(Q) = k·m(Q), we see
that k = 1/m(Q) and our new measure on subsets of 𝒰 is determined
by the formula

\[ m'(X) = \frac{m(X)}{m(Q)}. \tag{1} \]

How does this affect the probability of p? First of all the truth set
of p has been reduced. Because all elements of Q̃, the complement of Q,
have been eliminated, the new truth set of p is P ∩ Q and therefore

\[ \Pr[p \mid q] = m'(P \cap Q) = \frac{m(P \cap Q)}{m(Q)} = \frac{\Pr[p \wedge q]}{\Pr[q]}. \tag{2} \]

Note that if the original measure m is the equiprobable measure, then
the new measure m' will also be the equiprobable measure on the
set Q.

We must take care that the denominators in (1) and (2) be different
from zero. Observe that m(Q) will be zero if Q is the empty set, which
happens only if q is self-contradictory. This is also the only case in
which Pr[q] = 0, and hence we make the obvious assumption that
our information q is not self-contradictory.


Example 1. In an election, candidate A has a .4 chance of winning,
B has .3 chance, C has .2 chance, and D has .1 chance. Just before
the election C withdraws. What are now the chances of the other
three candidates? Let q be the statement that C will not win, i.e.,
that A or B or D will win. Observe that Pr[q] = .8, hence all the
other probabilities are increased by a factor of 1/.8 = 1.25. Candidate
A now has .5 chance of winning, B has .375, and D has .125.
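
The rescaling in the election example amounts to dividing each remaining weight by m(Q). The sketch below does this with exact fractions; the dictionary `m` encodes the chances stated in the example, while the function name `condition` is an assumption made for this illustration.

```python
from fractions import Fraction

# Original measure from the election example.
m = {"A": Fraction(4, 10), "B": Fraction(3, 10),
     "C": Fraction(2, 10), "D": Fraction(1, 10)}

def condition(measure, Q):
    """New measure m'(x) = m(x) / m(Q) on the reduced possibility set Q."""
    m_Q = sum(measure[x] for x in Q)
    if m_Q == 0:
        raise ValueError("the given information must not have measure 0")
    return {x: measure[x] / m_Q for x in Q}

m_new = condition(m, {"A", "B", "D"})   # q: "C will not win"
print({x: float(w) for x, w in sorted(m_new.items())})
```

The new weights again sum to 1, and the ratios between the remaining candidates are unchanged.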


Example 2. A family is chosen at random from the set of all
families having exactly two children (not twins). What is the probability
that the family has two boys, if it is known that there is a boy
in the family? Without any information being given, we would assign
the equiprobable measure on the set 𝒰 = {BB, BG, GB, GG} where
the first letter of the pair indicates the sex of the younger child and the
second that of the older. The information that there is a boy causes
𝒰 to change to {BB, BG, GB}, but the new measure is still the equiprobable
measure. Thus the conditional probability that there are
two boys given that there is a boy is 1/3. If, on the other hand, we know
that the first child is a boy, then the conditional probability is 1/2.
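
The two conditional probabilities for the two-child family can be found by counting within the reduced possibility set. This is a minimal sketch; the names `families`, `conditional`, and `two_boys` are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Equiprobable possibility set; first letter = younger child, second = older.
families = ["".join(pair) for pair in product("BG", repeat=2)]

def conditional(p, q):
    """Pr[p | q] under the equiprobable measure: count within the truth set of q."""
    Q = [f for f in families if q(f)]
    return Fraction(sum(1 for f in Q if p(f)), len(Q))

two_boys = lambda f: f == "BB"
print(conditional(two_boys, lambda f: "B" in f))      # there is a boy
print(conditional(two_boys, lambda f: f[0] == "B"))   # the first child is a boy
```

The two answers differ because “there is a boy” leaves three possibilities, while “the first child is a boy” leaves only two.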

A particularly interesting case of conditional probability is that
in which Pr[p|q] = Pr[p]. Here the new information q has no effect
on the probability of p, and we then say that p is independent of q.
If in (2) we replace Pr[p|q] by Pr[p], and cross-multiply, we get

(3) Pr[p ∧ q] = Pr[p]·Pr[q].

On the other hand, if we express the condition that q is independent
of p, we arrive at the same result. Hence the two statements are
independent of each other. We can therefore say that p and q are
independent if and only if (3) holds.

Example 3. Consider three throws of an ordinary coin, where we
consider the eight possibilities to be equally likely. Let p be the statement,
“A head turns up on the first throw,” and q be the statement,
“A tail turns up on the second throw.” Then Pr[p] = Pr[q] = 1/2 and
Pr[p ∧ q] = 1/4 and therefore p and q are independent statements.

While we have an intuitive notion of independence, it can happen
that two statements, which may not seem to be independent, are in
fact independent. For example, let r be the statement, “The same
side turns up all three times.” Let s be the statement “At most one
head occurs.” Then r and s are independent statements (see Exercise
10).
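
Condition (3) can be tested numerically for the three-coin example by listing the eight outcomes. The sketch below is only a numerical check, not a proof; the helper names `pr` and `independent` are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

throws = list(product("HT", repeat=3))   # eight equally likely outcomes

def pr(statement):
    """Probability of a statement under the equiprobable measure."""
    return Fraction(sum(1 for t in throws if statement(t)), len(throws))

def independent(p, q):
    """Check condition (3): Pr[p and q] = Pr[p] * Pr[q]."""
    return pr(lambda t: p(t) and q(t)) == pr(p) * pr(q)

r = lambda t: len(set(t)) == 1        # the same side turns up all three times
s = lambda t: t.count("H") <= 1       # at most one head occurs
print(pr(r), pr(s), independent(r, s))
```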


EXERCISES 

1. A card is drawn at random from a pack of playing cards. What is the 
probability that it is a 5, given that it is between 2 and 7 inclusive? 

2. There are 200 participants in a raffle. How much is one’s chance of 
winning increased if 125 other names are eliminated? 

3. A die is thrown twice. What is the probability that the sum of the
faces which turn up is greater than 10, given that one of them is a 6? Given
that the first throw is a 6? [Ans. 3/11; 1/3.]

4. Referring to Chapter IV, Section 3, Exercise 9, what is the probability
that the man selected studies German if:

(a) He studies French? 





(b) He studies French and Russian? 

(c) He studies neither French nor Russian? 

5. In the primary voting example of Chapter II, Section 1, assuming 

that the equiprobable measure has been assigned, find the probability that 
A wins at least two primaries, given that B drops out of the Wisconsin 
primary? [Arts. 5-.] 

6 . If Pr[~p] = i and Pr[g|p] = h what is Pr[p A g]? [Ans. f.] 

7. A student takes a five-question true-false exam. What is the probability
that he will get all answers correct if:

(a) He is only guessing? 

(b) He knows that the instructor puts more true than false questions 
on his exams? 

(c) He also knows that the instructor never puts three questions in a 
row with the same answer? 

(d) He also knows that the first and last questions must have the 
opposite answer? 

(e) He also knows that the answer to the second problem is “false”?

8 . Three persons, A, B, and C, are placed at random in a straight line. 
Let r be the statement, “B is to the right of A,” and let s be the statement, 
“C is to the right of A.” 

(a) What is Pr[r ∧ s]? [Ans. 1/3.]

(b) Are r and s independent? [Ans. No.] 

9. Let a deck of cards consist of the jacks and queens chosen from a
bridge deck, and let two cards be drawn from the new deck. Find:

(a) The probability that the cards are both jacks, given that one is a
jack. [Ans. 3/11 = 0.27.]

(b) The probability that the cards are both jacks, given that one is a
red jack. [Ans. 5/13 = 0.38.]

(c) The probability that the cards are both jacks, given that one is
the jack of hearts. [Ans. 3/7 = 0.43.]

10. Prove that statements r and s in Example 3 are independent. 

11. The following example shows that r may be independent of p and q
without being independent of p ∧ q and p ∨ q. We throw a coin twice.
Let p be “The first toss comes out heads,” q be “The second toss comes out
heads,” and r be “The two tosses come out the same.” Compute Pr[r],
Pr[r|p], Pr[r|q], Pr[r|p ∧ q], Pr[r|p ∨ q]. [Ans. 1/2, 1/2, 1/2, 1, 1/3.]

12. Prove that for any two statements p and q,

Pr[p] = Pr[p ∧ q] + Pr[p ∧ ~q].

13. Assume that p and q are independent statements. Prove that each of
the following pairs of statements are independent.

(a) p and ~q. 

(b) and p. 

(c) ~p and ~q. 

14. Prove that for any three statements p, q, and r,

Pr[p ∧ q ∧ r] = Pr[p]·Pr[q|p]·Pr[r|p ∧ q].


*6. MEASURES AS AREAS 

For many purposes it is convenient to represent measures by means
of areas. We draw the universal set 𝒰 as a unit square, and to each
element of 𝒰 we assign an area equal to its weight. By choosing these
areas nonoverlapping, we will have assigned all of 𝒰. Then to any
set we assign the total area of its elements, which will be equal to its
measure.


Example 1. Let 𝒰 = {a, b, c}, and assign the weights -J, % to a, b, 
and c, respectively. Figure 3 shows the corresponding areas, and also 
shows the area associated with {a, c} (shaded in the figure). The latter 
area is, of course, h + i = h 


It is not necessary to start with the weights. We can draw in the 
subsets directly as long as one condition is satisfied: each set 
represented in our diagram has area equal to its measure. 

The geometric representation helps to clarify many theoretical 
considerations. As an illustration we will consider the measure of the 
union of two sets with m(P) = .5 and m(Q) = .3. What is m(P ∪ Q)? To P 
we assign an area of .5, and then we want to add Q to our diagram. 
The set Q is assigned an area of .3, but how much is this area to 
overlap P? If (a) there is no overlap, then the total area is .8; while (b) 
if Q is inside P, then the total area is .5; finally (c) if the overlap is, 
say, equal to .2, then the total area is .6. (These three possibilities 
are shown in Figure 4.) Since, in each case, the overlap represents 
P ∩ Q, we have no choice: we must make the area of the overlap 
equal to m(P ∩ Q). It is easy to read the formula m(P ∪ Q) = 
m(P) + m(Q) − m(P ∩ Q) from Figure 4(c). Figure 4(a) shows the 
case where P and Q are disjoint. 
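
The formula for the measure of a union can be checked numerically. The following is only a sketch: the five-element universal set and its weights are hypothetical, chosen so that m(P) = .5, m(Q) = .3, and the overlap has measure .2, as in case (c).

```python
from fractions import Fraction

# Hypothetical weights on a five-element universal set (they sum to 1).
weights = {
    "a": Fraction(3, 10),
    "b": Fraction(2, 10),
    "c": Fraction(1, 10),
    "d": Fraction(3, 10),
    "e": Fraction(1, 10),
}

def m(subset):
    """Measure of a set: the total weight (area) of its elements."""
    return sum((weights[x] for x in subset), Fraction(0))

P = {"a", "b"}   # m(P) = .5
Q = {"b", "c"}   # m(Q) = .3; the overlap {b} has measure .2

lhs = m(P | Q)                    # area of the union, measured directly
rhs = m(P) + m(Q) - m(P & Q)      # the formula read from Figure 4(c)
print(lhs, rhs)                   # 3/5 3/5
```

Subtracting m(P ∩ Q) corrects for the overlap, which would otherwise be counted twice.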



Figure 4 (panels (a), (b), and (c)) 


If we have just one subset of 𝒰, say P, we can always represent it 
by a vertical strip, i.e., a rectangle with height equal to one. 
But if Q is added, we cannot always represent it by a complete 
horizontal strip, as can be seen in Figure 4. Let us consider the special 
case, shown in Figure 5, where such representation is possible. The 
set P has base equal to m(P) and height one. The set Q has base one 
and height m(Q). The intersection, P ∩ Q, has area m(P) · m(Q). 
That means that m(P ∩ Q) = m(P) · m(Q), which is the special case 
of independence. 



Figure 5 Figure 6 


To represent the probabilities of statements, we must represent 
the measures of their truth sets. Hence the above method is applicable 
to the representation of probabilities as well. It can be used also to 
represent conditional probabilities: Pr[p|q] = Pr[p ∧ q]/Pr[q] = 
m(P ∩ Q)/m(Q). Hence the conditional probability is represented 
by the ratio of the common area of P and Q to the total area of Q. 
In Figure 6 let m(P) = x + y, m(Q) = x + z, and m(P ∩ Q) = x; 
then Pr[p|q] = x/(x + z) and Pr[q|p] = x/(x + y). 

As an application of this geometric representation we will develop 
Bayes' theorem. The simplest case of this theorem is where we know 
the probability of p occurring, and later we get additional evidence q 
whose relation to p is known. The question is how the new 
information changes the probability of p. More specifically, we know Pr[p], 
Pr[q|p], and Pr[q|~p] to start with. What we want to know is 
Pr[p|q]. 

Example 2. Suppose we have two urns. The first contains two 
black balls and one white ball, while the second contains two white 
balls and one black ball. We select an urn according to a random 
device, which makes the probability of choosing the first urn 3/4, and 
then draw a ball. If a black ball is drawn, what is the probability 
that we drew from the first urn? Let p state that we draw from the 
first urn, and q that we draw a black ball. Then Pr[p] = 3/4, 
Pr[q|p] = 2/3, and Pr[q|~p] = 1/3. We are to find Pr[p|q]. These 
numbers are represented in Figure 7. Observe that x_1 + y_1 = 3/4, 
x_2 + y_2 = 1/4, x_1/(x_1 + y_1) = 2/3, and x_2/(x_2 + y_2) = 1/3. 
Therefore, x_1 = (3/4)(2/3) = 1/2, and x_2 = (1/4)(1/3) = 1/12. 
Finally Pr[p|q] = x_1/(x_1 + x_2) = 6/7. 



Figure 8 


Here we started with only two alternatives, p and ~p. But the 
theorem to be developed is applicable to any number of alternatives. 
We will work it out for four alternatives. The statements p_1, p_2, p_3, 
and p_4 are said to form a complete set of alternatives if one of them 
must be true and no more than one can be true. (See Chapter I, 
Section 8.) In this case their truth sets are disjoint, and their union 
is 𝒰. Let q be some other statement. This situation is represented in 
Figure 8. 

As initial data we are given the probabilities of the four alternatives, 
and we know the conditional probability of q relative to each 
alternative. 

    Pr[p_1] = x_1 + y_1        Pr[q|p_1] = x_1/(x_1 + y_1) 
    Pr[p_2] = x_2 + y_2        Pr[q|p_2] = x_2/(x_2 + y_2) 
    Pr[p_3] = x_3 + y_3        Pr[q|p_3] = x_3/(x_3 + y_3) 
    Pr[p_4] = x_4 + y_4        Pr[q|p_4] = x_4/(x_4 + y_4) 

Thus we are given the areas of the four vertical strips and know 
in each case what fraction the upper portion of the strip is, so 
that we have the areas of the four upper portions. By multiplying 
the two probabilities on each line we obtain x_1 = Pr[p_1] · Pr[q|p_1], 
x_2 = Pr[p_2] · Pr[q|p_2], etc. We are interested in finding the probability 
of one of the four alternatives, say p_2, given that q has taken place. 
In other words we want to know what fraction x_2 is of the area of Q. 
The formula 

                                    Pr[p_2] · Pr[q|p_2] 
    Pr[p_2|q] = ------------------------------------------------------------------------------- 
                Pr[p_1]·Pr[q|p_1] + Pr[p_2]·Pr[q|p_2] + Pr[p_3]·Pr[q|p_3] + Pr[p_4]·Pr[q|p_4] 

gives the desired probability, as can be checked. Similar formulas 
apply for the other alternatives, and the formula generalizes in an 
obvious way to any number of alternatives. In its most general form 
it is called Bayes' theorem. 


Example 3. Suppose that a freshman must choose among 
mathematics, physics, chemistry, and astronomy as his science course. On 
the basis of the interest he expressed, his adviser assigns probabilities 
of .4, .3, .2, and .1 to his choosing each of the four courses, 
respectively. His adviser does not hear which course he actually chose, but 
at the end of the term the adviser hears that he received A in the 
course chosen. On the basis of the difficulties of these courses the 
adviser estimates the probability of the student getting an A in 
mathematics to be .1, in physics .2, in chemistry .3, and in astronomy 
.9. How can the adviser revise his original estimates as to the 
probabilities of his taking the various courses? Using Bayes' theorem we 
get 

Pr[He took Math | He got an A] 

      = (.4)(.1) / [(.4)(.1) + (.3)(.2) + (.2)(.3) + (.1)(.9)] = .04/.25 = .16. 

Similar computations assign probabilities of .24, .24, and .36 to the 
other three courses. Thus the new information, that he received an 
A, had little effect on the probability of his having taken physics or 
chemistry, but it has made it much less likely that he took 
mathematics, and much more likely that he took astronomy. 
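
The revision of the adviser's estimates can be reproduced with a short computation. This is a minimal sketch; the function name `bayes` and the list layout are illustrative choices, not part of the text.

```python
def bayes(priors, likelihoods):
    """Revised probabilities Pr[p_i | q] for a complete set of alternatives.

    priors[i] is Pr[p_i]; likelihoods[i] is Pr[q | p_i].
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]  # the areas x_i
    total = sum(joint)                                    # the area of Q
    return [x / total for x in joint]

courses = ["mathematics", "physics", "chemistry", "astronomy"]
priors = [0.4, 0.3, 0.2, 0.1]        # adviser's original estimates
likelihoods = [0.1, 0.2, 0.3, 0.9]   # probability of an A in each course

for course, revised in zip(courses, bayes(priors, likelihoods)):
    print(f"{course}: {revised:.2f}")
```

Each product in `joint` is the area of the upper part of one vertical strip, and dividing by their sum expresses that area as a fraction of Q.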


It is important to note that knowing the conditional probabilities 
of q relative to the alternatives is not enough. Unless we also know 
the probabilities of the alternatives at the start, we cannot apply 
Bayes' theorem. However, in some situations it is reasonable to 
assume that the alternatives are equally probable at the start. In this 
case the factors Pr[p_1], . . ., Pr[p_4] cancel from our basic formula, 
and we get the special form of the theorem: 

If Pr[p_1] = Pr[p_2] = Pr[p_3] = Pr[p_4], then 

                              Pr[q|p_2] 
    Pr[p_2|q] = ----------------------------------------------- 
                Pr[q|p_1] + Pr[q|p_2] + Pr[q|p_3] + Pr[q|p_4] 

Example 4. In a sociological experiment the subjects are handed 
four sealed envelopes, each containing a problem. They are told to 
open one envelope and try to solve the problem in ten minutes. From 
past experience, the experimenter knows that the probability of their 
being able to solve the hardest problem is .1. With the other problems 
they have probabilities of .3, .5, and .8. Assume the group succeeds 
within the allotted time. What is the probability that they selected 
the hardest problem? Since they have no way of knowing which 
problem is in which envelope, they choose at random, and we assign 
equal probabilities to the selection of the various problems. Hence the 
above simple formula applies. The probability of their having selected 
the hardest problem is .1/(.1 + .3 + .5 + .8) = 1/17. 


Example 5. Suppose that we can play any one of three slot 
machines, which usually pay off only with probability .1. Suppose also 
that a friend has told us that one of the machines is out of order and 
pays off with probability .4. We select one machine at random, and 
play. How does the outcome affect the probability that we have 
selected the profitable machine? If we win, then the probability that 
we have selected the right machine is .4/(.4 + .1 + .1) = 2/3. If we 
lose, the probability is .6/(.6 + .9 + .9) = 1/4. Since we started with 
probability 1/3, we see that a win gives us much more information than 
a loss. This is reasonable, since a loss is quite likely even on the 
profitable machine. 
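
When the alternatives are equally probable at the start, only the conditional probabilities enter the computation. The sketch below applies this special form to the numbers of Examples 4 and 5; the function name and the ordering of the lists are illustrative assumptions.

```python
from fractions import Fraction

def bayes_equal_priors(likelihoods):
    """Revised probabilities when all alternatives start out equally probable."""
    total = sum(likelihoods)
    return [l / total for l in likelihoods]

# Example 4: chances of solving each problem, hardest problem listed first.
solve = [Fraction(1, 10), Fraction(3, 10), Fraction(5, 10), Fraction(8, 10)]
print(bayes_equal_priors(solve)[0])   # 1/17

# Example 5: chances of a payoff, profitable machine listed first.
win = [Fraction(4, 10), Fraction(1, 10), Fraction(1, 10)]
lose = [1 - w for w in win]
print(bayes_equal_priors(win)[0])     # 2/3
print(bayes_equal_priors(lose)[0])    # 1/4
```

Exact fractions are used so that the results can be compared directly with the values in the text.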


EXERCISES 

1. Represent the sets P and Q as areas of a square, given the following 
information: 

(a) m(P) = m(Q) = .5, m(P ∩ Q) = .3. 

(b) m(P) = .5, m(Q) = .4, m(P ∪ Q) = .8. 

(c) m(P) = .6, m(Q) = .4, P and Q are disjoint. 

2. Given the following information about statements p and q, represent 
their truth sets P and Q as areas: 

(a) Pr[p] = Pr[q] = A, Pr[p ∧ q] = .2. 

(b) Pr[p] = Pr[q] = .5, Pr[p ∨ q] = .75. 

(c) Pr[p] = Pr[q] = Pr[p ∧ q] = .3. 

(d) Pr[p] = Pr[q] = .4, Pr[q|p] = .5. 

3. During the month of May the probability of a rainy day is .2. The 
Dodgers win on a clear day with probability .7, but on a rainy day only with 
probability .4. If we know that they won a certain game in May, what is the 
probability that it rained on that day? [Ans. 1/8.] 

4. Construct a diagram to represent the truth sets of various statements 
occurring in the previous exercise. 

5. On a multiple-choice exam there are four possible answers for each 
question. Therefore, if a student knows the right answer, he has probability 
one of choosing correctly; if he is guessing, he has probability 1/4 of choosing 
correctly. Let us further assume that a good student will know 90 per cent 
of the answers, a poor student only 50 per cent. If a good student has the 
right answer, what is the probability that he was only guessing? Answer the 
same question about a poor student, if the poor student has the right answer. 
[Ans. 1/37; 1/5.] 

6. Three economic theories are proposed at a given time, which appear to 
be equally likely on the basis of existing evidence. The state of the American 
economy is observed the following year, and it turns out that its actual 
development had probability .6 of happening according to the first theory; and 
probabilities .4 and .2 according to the others. How does this modify the 
probabilities of correctness of the three theories? 

7. Let p_1, p_2, p_3, and p_4 be a set of equally likely alternatives. Let 
Pr[q|p_1] = a, Pr[q|p_2] = b, Pr[q|p_3] = c, Pr[q|p_4] = d. Show that, if 
a + b + c + d = 1, then the revised probabilities of the alternatives 
relative to q are a, b, c, and d, respectively. 

8. In poker, Smith holds a very strong hand and bets a considerable 
amount. The probability that his opponent, Jones, has a better hand is .05. 
With a better hand Jones would raise the bet with probability .9, but with a 
poorer hand Jones would raise only with probability .2. Suppose that Jones 
raises, what is the new probability that he has a winning hand? [Ans. 9/47.] 

9. A rat is allowed to choose one of five mazes at random. If we know that 
the probabilities of his getting through the various mazes in three minutes 
are .6, .3, .2, .1, .1, and we find that the rat escapes in three minutes, how 
probable is it that he chose the first maze? The second maze? [Ans. 6/13, 3/13.] 

7. FINITE STOCHASTIC PROCESSES 

We consider here a very general situation which we will specialize 
in later sections. We deal with a sequence of experiments where the 
outcome on each particular experiment depends on some chance 
element. Any such sequence is called a stochastic process. (The Greek 
word “stochos” means “guess.”) We shall assume a finite number of 
experiments and a finite number of possibilities for each experiment. 
We assume that, if all the outcomes of the experiments which precede 
a given experiment were known, then both the possibilities for this 
experiment and the probability that any particular possibility will 
occur would be known. We wish to make predictions about the 
process as a whole. For example, in the case of repeated throws of an 
ordinary coin we would assume that on any particular experiment we 
have two outcomes, and the probability of each of these outcomes 
is one-half regardless of any other outcomes. We might be interested, 
however, in the probabilities of statements of the form, “More than 
two-thirds of the throws result in heads,” or “The number of heads 
and tails which occur is the same,” etc. These are questions which 
can be answered only when a probability measure has been assigned 
to the process as a whole. In this section we show how a probability 
measure can be assigned, using the given information. In the case 
of coin tossing, the probabilities (hence also the possibilities) on any 
given experiment do not depend upon the previous results. We will 
not make any such restriction here since the assumption is not true 
in general. 

We shall show how the probability measure is constructed for a 
particular example, and the procedure in the general case is similar. 

We assume that we have a sequence of three experiments, the 
possibilities for which are indicated in Figure 9. The set of all possible 
outcomes which might occur on any of the experiments is represented 
by the set {a, b, c, d, e, f}. Note that if we know that outcome b 
occurred on the first experiment, then we know that the possibilities 
on experiment two are {a, e, d}. Similarly if we know that b occurred 
on the first experiment and a on the second, then the only possibilities 
for the third are {c, f}. We denote by p_a the probability that the first 
experiment results in outcome a, and by p_b the probability that 
outcome b occurs in the first experiment. We denote by p_{b,d} the 
probability that outcome d occurs on the second experiment, which is the 
probability computed on the assumption that outcome b occurred on 
the first experiment. Similarly for p_{b,a}, p_{b,e}, p_{a,a}, p_{a,c}. We denote 
by p_{bd,c} the probability that outcome c occurs on the third experiment, 
the latter probability being computed on the assumption that 
outcome b occurred on the first experiment and d on the second. 
Similarly for p_{ba,c}, p_{ba,f}, etc. We have assumed that these numbers are 
given and the fact that they are probabilities assigned to possible 
outcomes would mean that they are positive and that 

Figure 9 

p_a + p_b = 1, p_{b,a} + p_{b,e} + p_{b,d} = 1, and p_{bd,a} + p_{bd,c} = 1, etc. 

It is convenient to associate each probability with the branch of the 
tree that connects the branch point representing the predicted 
outcome. We have done this in Figure 9 for several branches. The sum 
of the numbers assigned to branches from a particular branch point 
is one, e.g., p_{b,a} + p_{b,e} + p_{b,d} = 1. 

A possibility for the sequence of three experiments is indicated by 
a path through the tree. We define now a probability measure on 
the set of all paths. We call this a tree measure. To the path 
corresponding to outcome b on the first experiment, d on the second, and c 
on the third, we assign the weight p_b · p_{b,d} · p_{bd,c}. That is the product 
of the probabilities associated with each branch along the path being 
considered. We find the probability for each path through the tree. 

Before showing the reason for this choice we must first show that 
it determines a probability measure, in other words that the weights 
are positive and the sum of the weights is one. The weights are 
products of positive numbers and hence positive. To see that their 
sum is one we first find the sum of the weights of all paths 
corresponding to a particular outcome, say b, on the first experiment and a 
particular outcome, say d, on the second. We have 

p_b · p_{b,d} · p_{bd,a} + p_b · p_{b,d} · p_{bd,c} = p_b · p_{b,d} [p_{bd,a} + p_{bd,c}] = p_b · p_{b,d}. 

For any other first two outcomes we would obtain a similar result. 
For example, the sum of the weights assigned to paths corresponding 
to outcome a on the first experiment and c on the second is p_a · p_{a,c}. 
Notice that when we have verified that we have a probability 
measure, this will be the probability that the first experiment results in a 
and the second experiment results in c. 

Next we find the sum of the weights assigned to all the paths 
corresponding to the cases where the outcome of the first experiment is b. 
We find this by adding the sums corresponding to the different 
possibilities for the second experiment. But by our preceding calculation 
this is 

p_b · p_{b,a} + p_b · p_{b,e} + p_b · p_{b,d} = p_b [p_{b,a} + p_{b,e} + p_{b,d}] = p_b. 

Similarly the sum of the weights assigned to paths corresponding 
to the outcome a on the first experiment is p_a. Thus the sum of all 
weights is p_a + p_b = 1. Therefore we do have a probability measure. 
Note that we have also shown that the statement that the outcome 
of the first experiment is a has been assigned probability p_a, in 
agreement with our given probability. 

To see the complete connection of our new measure with the given 
probabilities, let X_j = z be the statement “The outcome of the jth 
experiment was z.” Then the statement [X_1 = b ∧ X_2 = d ∧ X_3 = c] 
is a compound statement that has been assigned probability 
p_b · p_{b,d} · p_{bd,c}. The statement [X_1 = b ∧ X_2 = d] we have noted has 
been assigned probability p_b · p_{b,d} and the statement [X_1 = b] has 
been assigned probability p_b. Thus 

Pr[X_3 = c | X_2 = d ∧ X_1 = b] = (p_b · p_{b,d} · p_{bd,c})/(p_b · p_{b,d}) = p_{bd,c}, 

Pr[X_2 = d | X_1 = b] = (p_b · p_{b,d})/p_b = p_{b,d}. 

Thus we see that our probabilities, computed under the assumption 
that previous results were known, become the corresponding 
conditional probabilities when computed with respect to the tree measure. 
It can be shown that the tree measure which we have assigned is the 
only one which will lead to this agreement. We can now find the 
probability of any statement concerning the stochastic process from 
our tree measure. 
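
The construction of a tree measure can be sketched in a few lines of code. The tree below is only loosely modeled on Figure 9: its shape beyond what the text describes and all of its branch probabilities are assumptions made for illustration, as are the names `path_weights` and `pr`.

```python
from fractions import Fraction as F

# Each branch point maps an outcome to (branch probability, subtree).
# An empty dict marks the end of a path. All numbers are hypothetical.
tree = {
    "a": (F(1, 3), {
        "a": (F(1, 2), {"b": (F(1), {})}),
        "c": (F(1, 2), {"d": (F(1, 2), {}), "e": (F(1, 2), {})}),
    }),
    "b": (F(2, 3), {
        "a": (F(1, 4), {"c": (F(1, 2), {}), "f": (F(1, 2), {})}),
        "e": (F(1, 4), {"a": (F(1), {})}),
        "d": (F(1, 2), {"a": (F(1, 3), {}), "c": (F(2, 3), {})}),
    }),
}

def path_weights(node, prefix=(), weight=F(1)):
    """Yield (path, weight); the weight is the product of branch probabilities."""
    if not node:
        yield prefix, weight
        return
    for outcome, (prob, subtree) in node.items():
        yield from path_weights(subtree, prefix + (outcome,), weight * prob)

measure = dict(path_weights(tree))

def pr(prefix):
    """Probability that the process begins with the given outcomes."""
    return sum(w for path, w in measure.items() if path[:len(prefix)] == prefix)

print(sum(measure.values()))                  # the weights sum to 1
print(pr(("b", "d", "c")) / pr(("b", "d")))   # recovers the branch probability 2/3
```

The last line illustrates the agreement discussed above: the conditional probability computed from the tree measure equals the branch probability that was given at the start.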


Example 1. Suppose that we have two urns. Urn 1 contains two 
black balls and three white balls. Urn 2 contains two black balls and 
one white ball. An urn is chosen at random and a ball chosen from 
this urn at random. What is the probability that a white ball is 
chosen? A hasty answer might be 1/2, since there are an equal number 
of black and white balls involved and everything is done at random. 
However, it is hasty answers like this (which is wrong) which show the 
need for a more careful analysis. 

We are considering two experiments. The first consists in choosing 
the urn and the second in choosing the ball. There are two possibilities 
for the first experiment, and we assign p_1 = p_2 = 1/2 for the 
probabilities of choosing the first and the second urn, respectively. 
We then assign p_{1,W} = 3/5 for the probability that a white ball is 
chosen, under the assumption that urn 1 is chosen. Similarly we assign 
p_{1,B} = 2/5, p_{2,W} = 1/3, p_{2,B} = 2/3. We indicate these probabilities 
on the possibility tree in Figure 10. The probability that a white ball 
is drawn is then found from the tree measure as the sum of the weights 
assigned to paths which lead to a choice of a white ball. This is 
(1/2)(3/5) + (1/2)(1/3) = 7/15. 
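
The sum over paths in this example can be written out as a short computation. This is a sketch using the urn contents stated above; the dictionary layout and the function name are illustrative.

```python
from fractions import Fraction as F

# Contents of the two urns as described in the example.
urns = {1: {"black": 2, "white": 3}, 2: {"black": 2, "white": 1}}
p_urn = F(1, len(urns))   # an urn is chosen at random

def pr_color(color):
    """Sum of the weights of all paths that end in a ball of this color."""
    total = F(0)
    for contents in urns.values():
        p_ball = F(contents[color], sum(contents.values()))
        total += p_urn * p_ball   # weight of the path: urn, then color
    return total

print(pr_color("white"))   # 7/15
print(pr_color("black"))   # 8/15
```

The answer differs from the hasty 1/2 because the white balls are not spread evenly over the two urns.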
Example 2. Suppose that a man leaves a bar which is on a corner 
which he knows to be one block from his home. He is unable to 
remember which street leads to his home. He proceeds to try each of 
the streets at random without ever choosing the same street twice 
until he goes on the one which leads to his home. What possibilities 
are there for his trip home, and what is the probability for each of 
these possible trips? We label the streets A, B, C, and Home. The 
possibilities together with typical probabilities are given in Figure 11. 
The probability for any particular trip, or path, is found by taking 
the product of the branch probabilities. 
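
The possible trips can also be enumerated mechanically. The sketch below assumes four streets at the corner, labeled A, B, C, and Home as in the example, with each untried street equally likely at every step; the function name `trips` is an illustrative choice.

```python
from fractions import Fraction as F

def trips(remaining, path=(), weight=F(1)):
    """Yield (trip, probability) for every possible trip home."""
    for street in remaining:
        branch = F(1, len(remaining))   # choose among untried streets at random
        new_path = path + (street,)
        if street == "Home":
            yield new_path, weight * branch
        else:
            rest = tuple(s for s in remaining if s != street)
            yield from trips(rest, new_path, weight * branch)

all_trips = dict(trips(("A", "B", "C", "Home")))
for trip, prob in sorted(all_trips.items(), key=lambda item: (len(item[0]), item[0])):
    print(" -> ".join(trip), prob)
print(sum(all_trips.values()))   # the path weights sum to 1
```

Each printed probability is the product of the branch probabilities along that trip, which is exactly the tree measure.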


Example 3. Assume that you are presented with two slot machines, 
A and B. Each machine pays the same fixed amount when it pays off. 
Machine A pays off each time with probability 1/2, and machine B 
with probability 1/4. You are not told which machine is A. Suppose 
that you choose a machine at random and win. What is the probability 
that you chose machine A? We first construct the tree (Figure 12) to 
show the possibilities and assign branch probabilities to determine a 
tree measure. Let p be the statement, “Machine A was chosen,” and 
q be the statement, “The machine chosen paid off.” Then we are asked 
for Pr[p|q] = Pr[p ∧ q]/Pr[q]. The truth set of the statement p ∧ q 
consists of a single path which has been assigned weight 1/4. The truth 
set of the statement q consists of two paths, and the sum of the weights 
of these paths is (1/2)(1/2) + (1/2)(1/4) = 3/8. Thus Pr[p|q] = 2/3. 
Thus if we win it is more likely that we have machine A than B, which 
suggests that next time we should play the same machine. If we lose, 
however, it is more likely that we have machine B than A, and hence 
we would switch machines before the next play. (See Exercise 9.) 


EXERCISES 

1. The fractions of Republicans, Democrats, and Independent voters in 
cities A and B are 

City A: .30 Republican, .40 Democratic, .30 Independent; 

City B: .40 Republican, .50 Democratic, .10 Independent. 

A city is chosen at random and two voters are chosen successively and at 
random from the voters of this city. Construct a tree measure and find the 
probability that two Democrats are chosen. Find the probability that the 
second voter chosen is an Independent voter. [Ans. .205; .2.] 

2. A coin is thrown. If a head turns up a die is rolled. If a tail turns up 
the coin is thrown again. Construct a tree measure to represent the two 
experiments and find the probability that the die is thrown and a six turns up. 

3. A man wins a certain tournament if he can win two consecutive games 
out of three played alternately with two opponents A and B. A is a better 
player than B. The probability of winning a game when B is the opponent 
is 2/3. The probability of winning a game when A is his opponent is only 1/3. 
Construct a tree measure for the possibilities for three games, assuming that 
he plays alternately but plays A first. Do the same assuming that he plays 
B first. In each case find the probability that he will win two consecutive 
games. Is it better to play two games against the strong player or against 
the weaker player? [Ans. 10/27; 8/27; better to play strong player twice.] 

4. Construct a tree measure to represent the possibilities for four throws 
of an ordinary coin. Assume that the probability of a head on any toss is 1/2 
regardless of any information about other throws. 

5. A student claims to be able to distinguish beer from ale. He is given 
a series of three tests. In each test he is given two cans of beer and one of 
ale and asked to pick out the ale. If he gets two or more correct we will 
admit his claim. Draw a tree to represent the possibilities (either right or 
wrong) for his answers. Construct the tree measure which would correspond 
to guessing and find the probability that his claim will be established if he 
guesses on every trial. 

6. A box contains three defective light bulbs and seven good ones. 
Construct a tree to show the possibilities if three consecutive bulbs are drawn 
at random from the box (they are not replaced after being drawn). Assign 
a tree measure and find the probability that at least one good bulb is drawn 
out. Find the probability that all three are good if the first bulb is good. 
[Ans. 119/120; 5/12.] 

7. In Example 2 above, find the probability that the man reaches home 
after trying at most one wrong street. 

8. In Example 3, find the probability that machine A was chosen, given 
that the player lost. 

9. In Example 3, assume that the player makes two plays. Find the 
probability that he wins at least once under the assumption: 

(a) That he plays the same machine twice. [Ans. 19/32.] 

(b) That he plays the same machine the second time if and only if he 
won the first time. [Ans. 20/32.] 

10. A chess player plays three successive games of chess. His psychological 
makeup is such that the probability of his winning a given game is (1/2)^(k+1), 
where k is the number of games he has won so far. (For instance, the 
probability of his winning the first game is 1/2, the probability of his winning the 
second game if he has already won the first game is 1/4, etc.) What is the 
probability that he will win at least two of the three games? 

11. Before a political convention, a political expert has assigned the follow¬ 
ing probabilities. The probability that the President will be willing to run 
again is f. If he is willing to run, he and his Vice President are sure to be 
nominated and have probability f of being elected again. If the President 
does not run, the present Vice President has probability A of being nomi¬ 
nated, and any other presidential candidate has probability f of being elected. 
What is the probability that the present Vice President will be re-elected? 

[Ans. if.] 

12. 13. Work Exercises 3 and 8 of Section 6, using the method illustrated 
in Example 3 of this section. 

14. There are two urns, A and B. Urn A contains one black and one red 
ball. Urn B contains two black and three red balls. A ball is chosen at 
random from urn A and put into urn B. A ball is then drawn at random 
from urn B. 

(a) What is the probability that both balls drawn are of the same 
color? [Ans. 7/12.] 

(b) What is the probability that the first ball drawn was red, given 
that the second ball drawn was black? [Ans. 2/5.] 

15. Assume that in the World Series each team has probability one-half 
of winning each game, independently of the outcomes of any other game. 
Assign a tree measure. (See Chapter 1, Section 6 for the tree.) Find the 
probability that the series ends in 4, 5, 6, and 7 games, respectively. 

16. Assume that in the World Series one team is stronger than the other 
and has probability 2/3 for winning each of the games. Assign a tree measure 
and find the following probabilities. 

(a) The probability that the stronger team wins in 4, 5, 6, and 7 games, 
respectively. 

(b) The probability that the weaker team wins in 4, 5, 6, and 7 games, 
respectively. 

(c) The probability that the series ends in 4, 5, 6, and 7 games, 
respectively. [Ans. .21; .30; .27; .22.] 

(d) The probability that the strong team wins the series. [Ans. .83.] 

17. In the World Series from 1905 to 1955, excluding the nine-game series, 
there have been 10 four-game series, 13 five-game series, 12 six-game series, 
and 13 seven-game series. Add the results from 1955 to date to these and use 
these past records to estimate the probability that a series will last 4, 5, 6, 
or 7 games. Compare your answers with those obtained theoretically in 
Exercises 15 and 16(c). Which assumption about the World Series play 
seems to fit the data better? 

8. INDEPENDENT TRIALS WITH TWO OUTCOMES 

In the preceding section we developed a way to determine a 
probability measure for any sequence of chance experiments where there 
are only a finite number of possibilities for each experiment. While 
this provides the framework for the general study of stochastic 
processes, it is too general to be studied in complete detail. Therefore, in 
probability theory we look for simplifying assumptions which will 
make our probability measure easier to work with. It is desired also 
that these assumptions be such as to apply to a variety of experiments 
which would occur in practice. In this book we shall limit ourselves 
to the study of two types of processes. The first, the independent 
trials process, will be considered in the present section. This process 
was the first one to be studied extensively in probability theory. The 
second, the Markov chain process, is a process that is finding 
increasing application, particularly in the social and biological sciences, and 
will be considered in Section 13. 

A process of independent trials applies to the following situation. 
Assume that there is a sequence of chance experiments, each of which 
consists of a repetition of a single experiment, carried out in such a 
way that the results of any one experiment in no way affect the results 
in any other experiment. We label the possible outcomes of a single 
experiment by a_1, . . . , a_r. We assume that we are also given 
probabilities p_1, . . . , p_r for each of these outcomes occurring on any single 
experiment, the probabilities being independent of previous results. 
The tree representing the possibilities for the sequence of experiments 
will have the same outcomes from each branch point, and the branch 
probabilities will be assigned by assigning probability p_j to any 
branch leading to outcome a_j. The tree measure determined in this 
way is the measure of an independent trials process. In this section 
we shall consider the important case of two outcomes for each 
experiment. The more general case is studied in Section 11. 

In the case of two outcomes we arbitrarily label one outcome 
“success” and the other “failure.” For example, in repeated throws 
of a coin we might call heads success, and tails failure. We assume 
there is given a probability p for success and a probability q = 1 − p 
for failure. The tree measure for a sequence of three such experiments 
is shown in Figure 13. The weights assigned to each path are 
indicated at the end of the path. 

The question which we now ask is the following. Given an inde¬ 
pendent trials process with two outcomes, what is the probability of 
exactly x successes in n experiments. We denote this probability by 
f(n,x;p) to indicate that it depends upon n, x , and p. 

Assume that we had a tree for this general situation, similar to the 
tree in Figure 13 for three experiments, with the branch points labeled 
S for success and F for failure. Then the truth set of the statement, 
“Exactly x successes occur,” consists of all paths which go through 
x branch points labeled S and n - x labeled F. To find the 
probability of this statement we must add the weights for all such paths. 
We are helped first by the fact that our tree measure assigns the same 
weight to any such path, namely p^x q^{n-x}. The reason for this is that 
every branch leading to an S is assigned probability p, and every 
branch leading to F is assigned probability q, and in the product there 
will be x p's and n - x q's. To find the desired probability we need 
only find the number of paths in the truth set of the statement, 
“Exactly x successes occur.” To each such path we make correspond an 
ordered partition of the integers from 1 to n which has two cells, 
x elements in the first and n - x in the second. We do this by putting 
the numbers of the experiments on which success occurred in the first 
cell and those for which failure occurred in the second cell. Since 
there are \binom{n}{x} such partitions there are also this number of paths in 
the truth set of the statement considered. Thus we have proved: 


In an independent trials process with two outcomes the probability of 
exactly x successes in n experiments is given by 

    f(n,x;p) = \binom{n}{x} p^x q^{n-x}.
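
The result just proved lends itself to a direct numerical check. What follows is a minimal sketch in Python and is not part of the original text; the function name f simply mirrors the notation f(n,x;p), and the choice p = 0.5 with three experiments is an illustrative assumption matching the size of the tree in Figure 13.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials.

    p is the probability of success on one trial and q = 1 - p the
    probability of failure, as in the text.
    """
    q = 1 - p
    # comb(n, x) counts the paths with x successes; each has weight p^x q^(n-x).
    return comb(n, x) * p**x * q**(n - x)


# Three experiments with p = 0.5 (an assumed value for illustration).
probabilities = [f(3, x, 0.5) for x in range(4)]
print(probabilities)
print(sum(probabilities))  # the weights of all paths must add up to 1
```

Summing over x = 0, 1, ..., n is a useful sanity check, since every path of the tree has some definite number of successes.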



Example 1. Consider n throws of an ordinary coin. We label 
heads “success” and tails “failure,” and we assume that the 
probability is 1/2 for heads on any one throw independently of the outcome 
of any other throw. Then the probability that exactly x heads will 
turn up is 

    f(n,x;1/2) = \binom{n}{x} (1/2)^n.





For example, in 100 throws the probability that exactly 50 heads 
will turn up is f(100,50;1/2) = \binom{100}{50} (1/2)^{100}, which is approximately .08. 
Thus we see that it is quite unlikely that exactly one-half of the tosses 
will result in heads. On the other hand, suppose that we ask for the 
probability that nearly one-half of the tosses will be heads. To be 
more precise, let us ask for the probability that the number of heads 
which occur does not deviate by more than 10 from 50. To find this 
we must add f(100,x;1/2) for x = 40, 41, ..., 60. If this is done, we 
obtain a probability of approximately .96. Thus, while it is unlikely 
that exactly 50 heads will occur, it is very likely that the number of 
heads which occur will not deviate from 50 by more than 10. 
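
The two figures quoted for 100 throws can be reproduced by carrying out the sums directly. The sketch below is an illustration only; the variable names exactly_50 and within_10 are our own labels and do not appear in the text.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


# Probability of exactly 50 heads in 100 throws of an ordinary coin.
exactly_50 = f(100, 50, 0.5)

# Probability that the number of heads lies between 40 and 60 inclusive,
# that is, does not deviate from 50 by more than 10.
within_10 = sum(f(100, x, 0.5) for x in range(40, 61))

print(round(exactly_50, 3), round(within_10, 3))
```

Note that range(40, 61) includes both end points 40 and 60, in agreement with "does not deviate by more than 10."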

Example 2. Assume that we have a machine which, on the basis 
of data given, is to predict the outcome of an election as either a 
Republican victory or a Democratic victory. If two identical 
machines are given the same data, they should predict the same result. 
We assume, however, that any such machine has a certain probability 
q of reversing the prediction that it would ordinarily make, because 
of a mechanical or electrical failure. To improve the accuracy of our 
prediction we give the same data to r identical machines, and choose 
the answer which the majority of the machines give. To avoid ties 
we assume that r is odd. Let us see how this decreases the probability 
of an error due to a faulty machine. 

Consider r experiments, where the jth experiment results in success 
if the jth machine produces the prediction which it would make when 
operating without any failure of parts. The probability of success is 
then p = 1 - q. The majority decision will agree with that of a 
perfectly operating machine if we have more than r/2 successes. 
Suppose, for example, that we have five machines, each of which has 
a probability of .1 of reversing the prediction because of a parts 
failure. Then the probability for success is .9, and the probability that 
the majority decision will be the desired one is 

    f(5,3;0.9) + f(5,4;0.9) + f(5,5;0.9) 

which is found to be approximately .991 (see Exercise 3). 

Thus the above procedure decreases the probability of error due to 
machine failure from .1 in the case of one machine to .009 for the case 
of five machines. 
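
The majority rule of Example 2 can be sketched in a few lines. This is an illustration under the assumptions of the example (independent machines, each reversing its answer with probability q, and r odd); the function name majority_correct is hypothetical and chosen here only for readability.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def majority_correct(r, q):
    """Probability that more than r/2 of r machines give the unreversed answer.

    q is the probability that a single machine reverses its prediction,
    so the probability of success on one machine is p = 1 - q.
    r is assumed to be odd, so that ties cannot occur.
    """
    p = 1 - q
    return sum(f(r, x, p) for x in range(r // 2 + 1, r + 1))


for r in (1, 3, 5, 7):
    print(r, round(majority_correct(r, 0.1), 4))
```

For r = 5 and q = .1 the sum consists of exactly the three terms f(5,3;0.9) + f(5,4;0.9) + f(5,5;0.9) displayed in the example.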
EXERCISES 

1. Compute for n = 4, n = 8, n = 12, and n = 16 the probability of 
obtaining exactly n/2 heads when an ordinary coin is thrown n times. 

[Ans. .375; .273; .226; .196.] 

2. Compute for n = 4, n = 8, n = 12, and n = 16 the probability that 
the fraction of heads deviates from § by less than J. 

[Ans. .375; .711; .854; .923.] 

3. Verify that the probability .991 given in Example 2 is correct. 

4. Assume that Peter and Paul match pennies four times. (In matching 
pennies, Peter wins a penny with probability 1/2, and Paul wins a penny with 
probability 1/2.) What is the probability that Peter wins more than Paul? 
Answer the same for five throws. For the case of 12,917 throws. 

[Ans. 5/16; 1/2; 1/2.] 

5. If an ordinary die is thrown four times, what is the probability that 
exactly two 6’s will occur? 

6. In a ten-question true-false exam, what is the probability of getting 
70 per cent or better by guessing? [Ans. 11/64.] 

7. Assume that, every time a batter comes to bat, he has probability .3 
for getting a hit. Assuming that his hits form an independent trials process 
and that the batter comes to bat four times, in what fraction of the games would 
he expect to get at least two hits? At least three hits? Four hits? 

[Ans. .348; .084; .008.] 

8. A coin is to be thrown eight times. What is the most probable number 
of heads that will occur? What is the number having the highest probability, 
given that the first four throws resulted in heads? 

9. A small factory has ten workers. The workers eat their lunch at one 

of two diners, and they are just as likely to eat in one as in the other. If the 
proprietors want to be more than .95 sure of having enough seats, how many 
seats must each of the diners have? [Ans. Eight seats.] 

10. Suppose that five people are chosen at random and asked if they favor 
a certain proposal. If only 30 per cent of the people favor the proposal, what 
is the probability that a majority of the five people chosen will favor the 
proposal? 

11. In Example 2, if the probability for a machine reversing its answer 
due to a parts failure is .2, how many machines would have to be used to 
make the probability greater than .89 that the answer obtained would be 
that which a machine with no failure would give? [Ans. Three machines.] 
12. Assume that it is estimated that a torpedo will hit a ship with 
probability J. How many torpedoes must be fired if it is desired that the 
probability for at least one hit should be greater than .9? 

13. A student estimates that, if he takes four courses, he has probability 
.8 of passing each course. If he takes five courses he has probability .7 of 
passing each course, and if he takes six courses he has probability .5 for 
passing each course. His only goal is to pass at least four courses. How many 
courses should he take for the best chance of achieving his goal? [Ans. 5.] 

*9. A PROBLEM OF DECISION 

In the preceding sections we have dealt with the problem of 
calculating the probability of certain statements based on the 
assumption of a given probability measure. In a statistics problem, one is 
often called upon to make a decision in a case where the decision 
would be relatively easy to make if we could assign probabilities to 
certain statements, but we do not know how to assign these 
probabilities. For example, if a vaccine for a certain disease is proposed, 
we may be called upon to decide whether or not the vaccine should 
be used. We may decide that we could make the decision if we could 
compare the probability that a person vaccinated will get the disease 
with the probability that a person not vaccinated will get the disease. 
Statistical theory develops methods to obtain from experiments some 
information which will aid in estimating these probabilities, or will 
otherwise help in making the required decision. We shall illustrate a 
typical procedure. 

Smith claims that he has the ability to distinguish ale from beer 
and has bet Jones a dollar to that effect. Now Smith does not mean 
that he can distinguish beer from ale with 100 per cent accuracy, but 
rather that he believes that he can distinguish them a proportion of 
the time which is significantly greater than 1/2. 

Assume that it is possible to assign a number p which represents 
the probability that Smith can pick out the ale from a pair of glasses, 
one containing ale and one beer. We identify p = 1/2 with his having 
no ability, p > 1/2 with his having some ability, and p < 1/2 with his 
being able to distinguish, but having the wrong idea which is the ale. 
If we knew the value of p, we would award the dollar to Jones if p 
were ≤ 1/2, and to Smith if p were > 1/2. As it stands, we have no 
knowledge of p and thus cannot make a decision. We perform an 
experiment and make a decision as follows. 
Smith is given a pair of glasses, one containing ale and the other 
beer, and is asked to identify which is the ale. This procedure is 
repeated ten times, and the number of correct identifications is noted. 
If the number correct is at least eight, we award the dollar to Smith, 
and, if it is less than eight, we award the dollar to Jones. 

We now have a definite procedure and shall examine this procedure 
both from Jones’s and Smith’s points of view. We can make two 
kinds of errors. We may award the dollar to Smith when in fact the 
appropriate value of p is ≤ 1/2, or we may award the dollar to Jones 
when the appropriate value for p is > 1/2. There is no way that these 
errors can be completely avoided. We hope that our procedure is such 
that each of the bettors will be convinced that, if he is right, he will 
very likely win the bet. 

Jones believes that the true value of p is 1/2. We shall calculate the 
probability of Jones winning the bet if this is indeed true. We assume 
that the individual tests are independent of each other and all have 
the same probability 1/2 for success. (This assumption will be 
unreasonable if the glasses are too large.) We have then an independent 
trials process with p = 1/2 to describe the entire experiment. The 
probability that Jones will win the bet is the probability that Smith 
gets fewer than eight correct. From the table in Figure 14 we 
compute that this probability is approximately .945. Thus Jones sees 
that, if he is right, it is very likely that he will win the bet. 

     x \ p    0.10    0.25    0.50    0.75    0.90

       0      .349    .056    .001    .000    .000
       1      .387    .188    .010    .000    .000
       2      .194    .282    .044    .000    .000
       3      .057    .250    .117    .003    .000
       4      .011    .146    .205    .016    .000
       5      .001    .058    .246    .058    .001
       6      .000    .016    .205    .146    .011
       7      .000    .003    .117    .250    .057
       8      .000    .000    .044    .282    .194
       9      .000    .000    .010    .188    .387
      10      .000    .000    .001    .056    .349

Table of Values of f(10,x;p) 

Figure 14 
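
The entries of Figure 14 can be regenerated from the formula of Section 8, which also gives a way to check Jones's probability without reading it off the table. The following sketch is illustrative; the names P_VALUES, table_row, and jones_wins are our own and are not taken from the text.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


# The five column headings of Figure 14.
P_VALUES = [0.10, 0.25, 0.50, 0.75, 0.90]


def table_row(x):
    """One row of the table: f(10,x;p) for each tabulated p, to three places."""
    return [round(f(10, x, p), 3) for p in P_VALUES]


for x in range(11):
    print(x, table_row(x))

# Jones wins if Smith gets fewer than eight correct; here p = 1/2.
jones_wins = sum(f(10, x, 0.5) for x in range(8))
print(jones_wins)
```

Adding the rounded table entries can differ from the exact sum in the third decimal place, which is why the text speaks of "approximately .945."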

Smith, on the other hand, believes that p is significantly greater 
than 1/2. If he believes that p is as high as .9, we see from Figure 14 
that the probability of his getting eight or more correct is .930. Then 
both men will be satisfied by the bet. 

Suppose, however, that Smith thinks the value of p is only about 
.75. Then the probability that he will get eight or more correct and 
thus win the bet is .526. There is then only an approximately even 
chance that the experiment will discover his abilities, and he probably 
will not be satisfied with this. If Smith really thinks his ability is 
represented by a p value of about 3/4, we would have to devise a 
different method of awarding the dollar. We might, for example, propose 
that Smith win the bet if he gets seven or more correct. Then, if he 
has probability 3/4 of being correct on a single trial, the probability 
that he will win the bet is approximately .776. If p = 1/2, the 
probability that Jones will win the bet is about .828 under this new 
arrangement. Jones’s chances of winning are thus decreased, but 
Smith may be able to convince him that it is a fairer arrangement 
than the first procedure. 
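
The comparison between the two ways of awarding the dollar amounts to evaluating one sum for several thresholds and several values of p. A minimal sketch follows; the function name smith_wins and its parameter names are hypothetical, introduced here only to organize the calculation.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def smith_wins(threshold, p, n=10):
    """Probability that Smith makes at least `threshold` correct choices in n trials."""
    return sum(f(n, x, p) for x in range(threshold, n + 1))


# First procedure: Smith needs eight or more correct out of ten.
print(round(smith_wins(8, 0.9), 3), round(smith_wins(8, 0.75), 3))

# Second procedure: Smith needs seven or more correct out of ten.
print(round(smith_wins(7, 0.75), 3), round(1 - smith_wins(7, 0.5), 3))
```

The last quantity, 1 - smith_wins(7, 0.5), is the probability that Jones wins under the second procedure when Smith is merely guessing.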

In the above example, it was possible to make two kinds of errors. 
The probability of making these errors depended on the way we 
designed the experiment and the method we used for the required 
decision. In some cases we are not too worried about the errors and can 
make a relatively simple experiment. In other cases, errors are very 
important, and the experiment must be designed with that fact in 
mind. For example, the possibility of error is certainly important in 
the case that a vaccine for a given disease is proposed, and the 
statistician is asked to help in deciding whether or not it should be 
used. In this case it might be assumed that there is a certain 
probability p that a person will get the disease if not vaccinated, and a 
probability r that he will get it if he is vaccinated. If we have some 
knowledge of the approximate value of p, we are then led to construct 
an experiment to decide whether r is greater than p, equal to p, or 
less than p. The first case would be interpreted to mean that the 
vaccine actually tends to produce the disease, the second that it has 
no effect, and the third that it prevents the disease; so that we can 
make three kinds of errors. We could recommend acceptance when 
it is actually harmful, we could recommend acceptance when it has 
no effect, or finally we could reject it when it actually is effective. 
The first and third might result in the loss of lives, the second in the 
loss of time and money of those administering the test. Here it would 
certainly be important that the probability of the first and third kinds 
of errors be made small. To see how it is possible to make the 
probability of both errors small, we return to the case of Smith and Jones. 

Suppose that, instead of demanding that Smith make at least eight 
correct identifications out of ten trials, we insist that he make at 
least 60 correct identifications out of 100 trials. (The glasses must 
now be very small.) Then, if p = 1/2, the probability that Jones wins 
the bet is about .98; so that we are extremely unlikely to give the 
dollar to Smith when in fact it should go to Jones. (If p < 1/2, it is 
even more likely that Jones will win.) If p > 1/2, we can also calculate 
the probability that Smith will win the bet. These probabilities are 
shown in the graph in Figure 15. The dashed curve gives for 
comparison the corresponding probabilities for the test requiring eight 
out of ten correct. Note that with 100 trials, if p is 3/4, the probability 
that Smith wins the bet is nearly 1, while in the case of eight out of 
ten, it was only about 1/2. Thus in the case of 100 trials, it would be 
easy to convince both Smith and Jones that whichever one is correct 
is very likely to win the bet. 

Figure 15 

Thus we see that the probability of both types of errors can be 
made small at the expense of having a large number of experiments. 
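
The two curves described for Figure 15 can be tabulated rather than drawn. The sketch below evaluates, for a few values of p chosen here for illustration, the probability that Smith wins under each test; the function name smith_wins is again hypothetical. The sums are exact, so the value at p = 1/2 for the longer test comes out a little above .97 for Jones, in line with the text's "about .98."

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def smith_wins(threshold, n, p):
    """Probability of at least `threshold` correct identifications in n trials."""
    return sum(f(n, x, p) for x in range(threshold, n + 1))


print("   p    8 of 10   60 of 100")
for p in (0.5, 0.6, 0.7, 0.75, 0.8, 0.9):
    short_test = smith_wins(8, 10, p)
    long_test = smith_wins(60, 100, p)
    print(f"{p:5.2f}  {short_test:8.3f}  {long_test:10.3f}")

# Jones's chance of winning the longer test when Smith merely guesses.
jones_long = 1 - smith_wins(60, 100, 0.5)
print(round(jones_long, 3))
```

Reading down the two columns shows the point of the section: the longer test is small near p = 1/2 and close to 1 near p = 3/4, while the shorter test changes much more slowly.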
EXERCISES 

1. Assume that in the beer and ale experiment Jones agrees to pay Smith 
if Smith gets at least nine out of ten correct. 

(a) What is the probability of Jones paying Smith even though Smith 
cannot distinguish beer and ale, and guesses? [Ans. .011.] 

(b) Suppose that Smith can distinguish with probability .9. What is 
the probability of his not collecting from Jones? [Ans. .264.] 

2. Suppose that in the beer and ale experiment Jones wishes the 
probability to be less than .1 that Smith will be paid if, in fact, he guesses. How 
many of ten trials must he insist that Smith get correct to achieve this? 

3. In the analysis of the beer and ale experiment, we assume that the 
various trials were independent. Discuss several ways that error can enter, 
because of the nonindependence of the trials, and how this error can be 
eliminated. (For example, the glasses in which the beer and ale were served 
might be distinguishable.) 

4. Consider the following two procedures for testing Smith's ability to 
distinguish beer from ale. 

(a) Four glasses are given at each trial, three containing beer and one 
ale, and he is asked to pick out the one containing ale. This 
procedure is repeated ten times. He must guess correctly seven or more 
times. 

(b) Ten glasses are given him, and he is told that five contain beer and 
five ale, and he is asked to name the five which he believes contain 
ale. He must choose all five correctly. 

In each case, find the probability that Smith establishes his claim by guessing. 
Is there any reason to prefer one test over the other? 

[Ans. (a) .003; (b) .004.] 

5. A testing service claims to have a method for predicting the order in 
which a group of freshmen will finish in their scholastic record at the end of 
college. The college agrees to try the method on a group of five students, 
and says that it will adopt the method if, for these five students, the 
prediction is either exactly correct or can be changed into the correct order by 
interchanging one pair of adjacent men in the predicted order. If the method 
is equivalent to simply guessing, what is the probability that it will be 
accepted? [Ans. 1/24.] 

6. The standard treatment for a certain disease leads to a cure in } of the 
cases. It is claimed that a new treatment will result in a cure in f of the 
cases. The new treatment is to be tested on ten people having the disease. 
If seven or more are cured the new treatment will be adopted. If three or 
fewer people are cured, the treatment will not be considered further. If 
the number cured is four, five, or six, the results will be called inconclusive, 
and a further study will be made. Find the probabilities for each of these 
three alternatives under the assumption first, that the new treatment has 
the same effectiveness as the old, and second, under the assumption that the 
claim made for the treatment is correct. 

7. Three students debate the intelligence of blonde dates. One claims that 
blondes are mostly (say 90 per cent of them) intelligent. A second claims 
that very few (say ten per cent) blondes are intelligent, while a third one 
claims that a blonde is just as likely to be intelligent as not. They administer 
an intelligence test to ten blondes, classifying them as intelligent or not. 
They agree that the first man wins the bet if eight or more are intelligent, 
the second if two or fewer, the third in all other cases. For each man, 
calculate the probability that he wins the bet, if he is right. 

[Ans. .930, .930, .890.] 

8. Ten men take a test with ten problems. Each man on each question 
has probability % of being right, if he does not cheat. The instructor deter¬ 
mines the number of students who get each problem correct. If he finds on 
four or more problems there are fewer than three or more than seven correct, 
he considers this convincing evidence of communication between the students. 
Give a justification for the procedure. [Hint: The table in Figure 14 must be 
used twice, once for the probability of fewer than three or more than seven 
correct answers on a given problem, and the second time to find the prob¬ 
ability of this happening on four or more problems.] 


*10. THE LAW OF LARGE NUMBERS 


In this section we shall study some further properties of the 
independent trials process with two outcomes. In Section 8 we saw that 
the probability for x successes in n trials is given by 

    f(n,x;p) = \binom{n}{x} p^x q^{n-x}.

In Figure 16 we show these probabilities graphically for n = 8 and 
p = 3/4. In Figure 17 we have done similarly for the case of n = 7 
and p = 3/4. 

We see in the first case that the values increase up to a maximum 
value at x = 6 and then decrease. In the second case the values 
increase up to a maximum value at x = 5, have the same value for 
x = 6, and then decrease. These two cases are typical of what can 
happen in general. 



Consider the ratio of the probability of x + 1 successes in n trials 
to the probability of x successes in n trials, which is 

    \frac{\binom{n}{x+1} p^{x+1} q^{n-x-1}}{\binom{n}{x} p^x q^{n-x}} = \frac{n-x}{x+1} \cdot \frac{p}{q}.

This ratio will be greater than one as long as (n - x)p > (x + 1)q 
or as long as x < np - q. If np - q is not an integer, the values 
\binom{n}{x} p^x q^{n-x} increase up to a maximum value, which occurs at the first 
integer greater than np - q, and then decrease. In case np - q is an 
integer, the values \binom{n}{x} p^x q^{n-x} increase up to x = np - q, are the same 
for x = np - q and x = np - q + 1, and then decrease. 

Figure 17 
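
The rule for locating the largest value can be checked against a direct search. The sketch below is an illustration, not the authors' method; the function name most_probable and the tolerance 1e-12 used to detect ties are assumptions made here. The two cases tried are those with n = 8 and n = 7 and p = 3/4.

```python
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def most_probable(n, p):
    """Return every x at which f(n,x;p) attains its largest value."""
    values = [f(n, x, p) for x in range(n + 1)]
    largest = max(values)
    # A small tolerance lets two equal maxima both be reported.
    return [x for x, v in enumerate(values) if abs(v - largest) < 1e-12]


for n in (8, 7):
    p = 0.75
    q = 1 - p
    print(n, "np - q =", n * p - q, "most probable x:", most_probable(n, p))
```

When np - q is an integer the search returns two neighboring values of x, and otherwise a single value, the first integer greater than np - q.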

Thus we see that, in general, values near np will occur with the 
largest probability. It is not true that one particular value near np 
is highly likely to occur, but only that it is relatively more likely than 
a value further from np. For example, in 100 throws of a coin, 
np = 100(1/2) = 50. The probability of exactly 50 heads is 
approximately .08. The probability of exactly 30 is approximately .00002. 

More information is obtained by studying the probability of 
a given deviation of the proportion of successes x/n from the 
number p; that is, by studying for ε greater than zero, the probability 
Pr[p - ε < x/n < p + ε]. 

For any fixed n, p, and ε, the latter probability could be found 
by adding all the values of f(n,x;p) for values of x for which the 
inequality p - ε < x/n < p + ε is satisfied. This would, for any 
particular choice of n, p, and ε, be a tedious task. However, it is proved 
in more advanced books that 

    Pr[p - ε < x/n < p + ε] ≥ 1 - \frac{pq}{nε^2}.

No matter how small ε is, if we choose n large enough, we can make 
1 - pq/(nε^2) as near to 1 as we wish. Thus the probability for the 
proportion of successes deviating from p by less than ε can be made 
arbitrarily near to 1 by choosing n large enough. The fact that this can 
be done is a special case of a very general theorem in probability 
theory called the law of large numbers. 

Let us put in the above inequality ε = k\sqrt{pq/n}. Then nε = k\sqrt{npq} 
and pq/(nε^2) = 1/k^2, so that we have 

    Pr[-k\sqrt{npq} < x - np < k\sqrt{npq}] ≥ 1 - \frac{1}{k^2}.

The quantity np is called the expected value for the number of 
successes, and the quantity \sqrt{npq} is called the standard deviation for 
the number of successes. We note that the probability of a deviation 
of more than k standard deviations from the expected value is less 
than or equal to 1/k^2. Thus for large k this probability will be small. 
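
For moderate n the "tedious task" of adding the values of f(n,x;p) is easily delegated to a machine, and the result can be set beside the lower bound. The sketch below is illustrative only; the function names prob_within and lower_bound are our own, and the values p = 1/2 and a deviation of 1/10 are assumptions chosen for the demonstration. Rational arithmetic is used so that the strict inequalities are tested exactly.

```python
from fractions import Fraction
from math import comb


def f(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def prob_within(n, p, eps):
    """Exact probability that x/n lies strictly between p - eps and p + eps."""
    return sum(f(n, x, p) for x in range(n + 1)
               if p - eps < Fraction(x, n) < p + eps)


def lower_bound(n, p, eps):
    """The bound 1 - pq/(n eps^2) quoted in the text."""
    return 1 - p * (1 - p) / (n * eps**2)


p, eps = Fraction(1, 2), Fraction(1, 10)
for n in (100, 400, 1600):
    exact = prob_within(n, p, eps)
    bound = lower_bound(n, p, eps)
    print(n, round(float(exact), 4), round(float(bound), 4))
```

In each line the exact probability lies above the bound, and both approach 1 as n grows, which is the content of the law of large numbers in this special case.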

It is possible to show by the more advanced theory that 

    Pr[-k\sqrt{npq} < x - np < k\sqrt{npq}] ≈ z_k 

where z_k is a number which can be found for any k and does not depend 
on n or on p. The symbol ≈ means that the indicated probability 
is only approximately given by z_k. The approximation improves with 
increasing n, but the error may be significant even for reasonably 
large n, and hence in practice this approximation must be used with 
care. 

We note that the approximation given above can also be 
interpreted as stating that the probability that x - np is either greater 
than k\sqrt{npq} or less than -k\sqrt{npq} is approximately 1 - z_k. In many 
applications one is interested only in the probability that x - np is 
greater than k\sqrt{npq}. It follows from the more advanced theory that 
this is approximately (1 - z_k)/2. Hence also the probability that 
x - np is less than -k\sqrt{npq} is approximately (1 - z_k)/2. 

It is convenient to think of the standard deviation as a unit of 
measurement. In this case z_k gives the approximate probability for 
a deviation of less than k units, or k standard deviations. The 
values of z_k for k = 1, 2, and 3 are z_1 = .683 . . . , z_2 = .954 . . . , 
z_3 = .997 . . . . Thus we see that it is very unlikely in a large number 
of trials to have a deviation from the expected value of more than 
three standard deviations. On the other hand z_{.1} = .080 . . . , which 
shows that it is quite unlikely that there will be a deviation of less 
than one-tenth of a standard deviation from the expected value. 
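
The text does not say how the numbers z_k are found. The sketch below assumes, as the quoted values such as .683 and .997 suggest, that z_k is the area under the standard normal curve between -k and k; under that assumption it can be written in terms of the error function as erf(k/√2). The function names z and one_sided are hypothetical labels introduced here.

```python
from math import erf, sqrt


def z(k):
    """Area under the standard normal curve between -k and k.

    This identification of z_k with the normal curve is an assumption
    of this sketch; the text only states that such numbers exist.
    """
    return erf(k / sqrt(2))


def one_sided(k):
    """Approximate probability that x - np exceeds k standard deviations."""
    return (1 - z(k)) / 2


for k in (0.1, 1, 2, 3):
    print(k, round(z(k), 3), round(one_sided(k), 4))
```

The second column reproduces the values of z_k used in this section, and the third gives the one-sided probabilities (1 - z_k)/2.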

Example 1. In throwing an ordinary coin 10,000 times, the 
expected number of heads is 5000, and the standard deviation for the 
number of heads is \sqrt{10,000(1/2)(1/2)} = 50. Thus the probability that 
the number of heads which turn up deviates from 5000 by less than 
one standard deviation, or 50, is approximately .683. The probability 
of a deviation of less than two standard deviations, or 100, is 
approximately .954. The probability of a deviation of less than three 
standard deviations, or 150, is approximately .997. On the other hand, the 
probability of a deviation of less than .1 standard deviation, or a 
deviation of less than five, is approximately .080. The statement that 
the number of heads deviates from 5000 by less than 150 is equivalent 
to the statement that the proportion of heads deviates from .5 by less 
than 150/10,000 = .015. 

Example 2. Assume that in a certain large city, 900 people are 
chosen at random and asked if they favor a certain proposal. Of the 
900 asked, 550 say they favor the proposal and 350 are opposed. If, 
in fact, the people in the city are equally divided on the issue, would it 
be unlikely that such a large majority would be obtained in a sample 
of 900 of the citizens? If the people were equally divided, we would 
assume that the 900 people asked would form an independent trials 
process with probability 1/2 for a “yes” answer and 1/2 for a “no” 
answer. Then the standard deviation for the number of “yes” 
answers in 900 trials is \sqrt{900(1/2)(1/2)} = 15. Then it would be very 
unlikely that we would obtain a deviation of more than 45 from the 
expected number of 450. The fact that the deviation in the sample 
from the expected number was 100, then, is evidence that the 
hypothesis that the voters were equally divided is incorrect. The 
assumption that the true proportion is any value less than 1/2 would also 
lead to the fact that a number as large as 550 favoring in a sample 
of 900 is very unlikely. Thus we are led to suspect that the true 
proportion is greater than 1/2. On the other hand, if the number who 
favored the proposal in the sample of 900 were 465, we would have 
only a deviation of one standard deviation, under the assumption of 
an equal division of opinion. Since such a deviation is not unlikely, 
we could not rule out this possibility on the evidence of the sample. 
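
The reasoning of Example 2 reduces to measuring an observed count in units of the standard deviation. A minimal sketch follows; the function names expected_and_sd and deviation_in_sd are hypothetical, and the hypothesis of an equal division (p = 1/2) is the one made in the example.

```python
from math import sqrt


def expected_and_sd(n, p):
    """Expected value np and standard deviation sqrt(npq) for n trials."""
    return n * p, sqrt(n * p * (1 - p))


def deviation_in_sd(observed, n, p):
    """Deviation of an observed count from np, in standard deviations."""
    mean, sd = expected_and_sd(n, p)
    return (observed - mean) / sd


mean, sd = expected_and_sd(900, 0.5)
print(mean, sd)

for yes_answers in (550, 465):
    print(yes_answers, round(deviation_in_sd(yes_answers, 900, 0.5), 2))
```

A count of 550 lies well beyond three standard deviations from 450, while a count of 465 lies exactly one standard deviation away, which is why only the former counts as evidence against an equal division.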


EXERCISES 

1. If an ordinary die is thrown 20 times, what is the expected number of 
times that a 6 will turn up? What is the standard deviation for the number of 
6’s that turn up? [Ans. 10/3; 5/3.] 

2. Suppose that an ordinary die is thrown 450 times. What is the 
expected number of throws that result in either a 3 or a 4? What is the standard 
deviation for the number of such throws? 

3. In 16 tosses of an ordinary coin, what is the expected number of heads 
that turn up? What is the standard deviation for the number of heads that 
occur? [Ans. 8; 2.] 

4. In 16 tosses of a coin, find the exact probability that the number of 
heads that turn up differs from the expected number by (a) less than one 
standard deviation, and (b) by not more than one standard deviation. Do 
the same for the case of two standard deviations, and for the case of three 
standard deviations. Show that the approximations given for large n lie 
between the values obtained, but are not very accurate for so small an n. 

[Ans. .546, .790; .923, .979; .996, .999.] 

5. Consider n independent trials with probability p for success. Let r 
and s be numbers such that p < r < s. What does the law of large numbers 
say about 

    Pr[r < x/n < s] 

as we increase n indefinitely? Answer the same question in the case that 
r < p < s. 

6. A drug is known to be effective in 20 per cent of the cases where it is 
used. A new agent is introduced, and in the next 900 times the drug is used 
it is effective 250 times. What can be said about the effectiveness of the drug? 

7. In a large number of independent trials with probability p for success, 

what is the approximate probability that the number of successes will deviate 
from the expected number by more than one standard deviation but less 
than two standard deviations? [Ans. .271.] 

8. What is the approximate probability that, in 10,000 throws of an 
ordinary coin, the number of heads which turn up lies between 4850 and 
5150? What is the probability that the number of heads lies in the same 
interval, given that in the first 1900 throws there were 1600 heads? 

9. Suppose that it is desired that the probability be approximately .95 
that the fraction of 6’s that turn up when a die is thrown n times does not 
deviate by more than .01 from the value 1/6. How large should n be? 

[Ans. Approximately 5555.] 

10. Two railroads are competing for the passenger traffic of 1000 passengers 

by operating similar trains at the same hour. If a given passenger is equally 
likely to choose one train as the other, how many seats should the railroad 
provide if it wants to be sure that its seating capacity is sufficient in 99 out 
of 100 cases? [Ans. 547.] 

11. Assume that 10 per cent of the people in a certain city have cancer. 
If 900 people are selected at random from the city, what is the expected 
number which will have cancer? What is the standard deviation? What is 
the approximate probability that more than 108 of the 900 chosen have 
cancer? [Ans. 90; 9; .023.] 

12. Suppose that in Exercise 11, the 900 people are chosen at random from 
those people in the city who smoke. Under the hypothesis that smoking has 
no effect on the incidence of cancer, what is the expected number in the 900 
chosen that have cancer? Suppose that more than 120 of the 900 chosen 
have cancer, what might be said concerning the hypothesis that smoking has 
no effect on the incidence of cancer? 

13. In Example 2, we made the assumption in our calculations that, if the 
true proportion of voters in favor of the proposal were p, then the 900 people 
chosen at random represented an independent trials process with probability 
p for a “yes” answer, and 1 — p for a “no” answer. Give a method for 
choosing the 900 people which would make this a reasonable assumption. 
Criticize the following methods: 

(a) Choose the first 900 people in the list of registered Republicans. 

(b) Choose 900 names at random from the telephone book. 

(c) Choose 900 houses at random and ask one person from each house, 
the houses being visited in the mid-morning. 

14. For n throws of an ordinary coin, let t_n be such that 

    Pr[-t_n < x/n - 1/2 < t_n] ≈ .997 

where x is the number of heads that turn up. Find t_n for n = 10^4, n = 10^6, 
and n = 10^20. [Ans. .015; .0015; .000,000,000,15.] 

15. Assume that a calculating machine carries out a million operations to 
solve a certain problem. In each operation the machine gives the answer 10^-5 
too small, with probability 1/2, and 10^-5 too large, with probability 1/2. Assume 
that the errors are independent of one another. What is a reasonable accuracy 
to attach to the answer? What if the machine carries out 10^9 operations? 

[Ans. ± .01; ±1.] 

*11. INDEPENDENT TRIALS WITH MORE THAN TWO OUTCOMES 

By extending the results of Section 8, we shall study the case of 
independent trials in which we allow more than two outcomes. We 
assume that we have an independent trials process where the possible 
outcomes are a_1, a_2, ..., a_k, occurring with probabilities p_1, p_2, ..., 
p_k, respectively. We denote by f(r_1,r_2,...,r_k;p_1,p_2,...,p_k) the 
probability that, in n = r_1 + r_2 + ... + r_k such trials, there will be 
r_1 occurrences of a_1, r_2 of a_2, etc. In the case of two outcomes this 
notation would be f(r_1,r_2;p_1,p_2). In Section 8 we wrote this as f(n,r_1;p_1) 
since r_2 and p_2 are determined from n, r_1, and p_1. We shall indicate 
how this probability is found in general, but carry out the details only 
for a special case. We choose k = 3, and n = 5 for purposes of 
illustration. We shall find f(1,2,2;p_1,p_2,p_3). 

We show in Figure 18 enough of the tree for this process to indicate 
the branch probabilities for a path (heavy lined) corresponding to 
the outcomes a_2, a_3, a_1, a_2, a_3. The tree measure assigns weight 
p_2 · p_3 · p_1 · p_2 · p_3 = p_1 · p_2^2 · p_3^2 to this path. 



There are, of course, other paths through the tree corresponding 
to one occurrence of a_1, two of a_2 and two of a_3. However, they would 
all be assigned the same weight, p_1 · p_2^2 · p_3^2, by the tree measure. Hence 
to find f(1,2,2;p_1,p_2,p_3), we must multiply this weight by the number 
of paths having the specified number of occurrences of each outcome. 

We note that the path $a_2, a_3, a_1, a_2, a_3$ can be specified by the
three-cell partition [{3}, {1,4}, {2,5}] of the numbers from 1 to 5. Here
the first cell shows the experiment which resulted in $a_1$, the second
cell shows the two that resulted in $a_2$, and the third shows the two
that resulted in $a_3$. Conversely, any such partition of the numbers
from 1 to 5 with one element in the first cell, two in the second, and
two in the third corresponds to a unique path of the desired kind.
Hence the number of paths is the number of such partitions. But
this is

$$\binom{5}{1,2,2} = \frac{5!}{1!\,2!\,2!} = 30$$

(see Chapter 3, Section 5), so that the probability of one occurrence
of $a_1$, two of $a_2$, and two of $a_3$ is

$$\binom{5}{1,2,2}\, p_1 \cdot p_2^2 \cdot p_3^2 = 30\, p_1 p_2^2 p_3^2.$$

The above argument carried out in general leads, for the case of
independent trials with outcomes $a_1, a_2, \ldots, a_k$ occurring with
probabilities $p_1, p_2, \ldots, p_k$, to the following.

The probability for $r_1$ occurrences of $a_1$, $r_2$ occurrences of $a_2$, etc., is
given by

$$f(r_1, r_2, \ldots, r_k; p_1, p_2, \ldots, p_k) = \binom{n}{r_1, r_2, \ldots, r_k}\, p_1^{r_1} p_2^{r_2} \cdots p_k^{r_k}.$$
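The general formula can be evaluated mechanically. The following is a minimal sketch in Python, not part of the original text; the function name `multinomial_probability` and the sample probabilities 1/2, 1/4, 1/4 are assumptions chosen only for illustration. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from math import factorial


def multinomial_probability(counts, probs):
    """Probability of counts[i] occurrences of outcome i in sum(counts) trials.

    counts: the numbers r_1, ..., r_k; probs: the probabilities p_1, ..., p_k.
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Number of partitions of the n trials into cells of sizes r_1, ..., r_k.
    coefficient = factorial(n)
    for r in counts:
        coefficient //= factorial(r)
    probability = Fraction(coefficient)
    # Every path with these counts has weight p_1^r_1 * ... * p_k^r_k.
    for r, p in zip(counts, probs):
        probability *= Fraction(p) ** r
    return probability


# Illustrative values for p_1, p_2, p_3 (assumed, not from the text).
example = multinomial_probability(
    [1, 2, 2], [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
)
print(example)
```

With these sample values the coefficient is 30 and each path has weight (1/2)(1/4)^2(1/4)^2, so the sketch mirrors the k = 3, n = 5 case worked out above.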


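Example 1. A die is thrown 12 times. What is the probability
that each number will come up twice? Here there are six outcomes,
1, 2, 3, 4, 5, 6 corresponding to the six sides of the die. We assign each
outcome probability 1/6. We are then asked for

$$f(2,2,2,2,2,2; \tfrac16, \tfrac16, \tfrac16, \tfrac16, \tfrac16, \tfrac16)$$

which is

$$\binom{12}{2,2,2,2,2,2} \left(\tfrac16\right)^{12} = \frac{12!}{(2!)^6\, 6^{12}} \approx .0034.$$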




Example 2. Suppose that we have a repeated-trials process with
four outcomes $a_1, a_2, a_3, a_4$ occurring with probability $p_1, p_2, p_3, p_4$,
respectively. It might be that we are interested only in the probability
that $r_1$ occurrences of $a_1$ and $r_2$ occurrences of $a_2$ will take place
with no specification about the number of each of the other possible
outcomes. To answer this question we simply consider a new experiment
where the outcomes are $a_1, a_2, \bar{a}_3$. Here $\bar{a}_3$ corresponds to an
occurrence of either $a_3$ or $a_4$ in our original experiment. The
corresponding probabilities would be $p_1$, $p_2$, and $\bar{p}_3$ with
$\bar{p}_3 = p_3 + p_4$. Let $\bar{r}_3 = n - (r_1 + r_2)$. Then our question is
answered by finding the probability in our new experiment for $r_1$
occurrences of $a_1$, $r_2$ of $a_2$, and $\bar{r}_3$ of $\bar{a}_3$, which is

$$\binom{n}{r_1, r_2, \bar{r}_3}\, p_1^{r_1}\, p_2^{r_2}\, \bar{p}_3^{\bar{r}_3}.$$


The same procedure can be carried out for experiments with any
number of outcomes where we specify the number of occurrences of
such particular outcomes. For example, if a die is thrown ten times
the probability that a one will occur exactly twice and a three exactly
three times is given by

$$\binom{10}{2,3,5} \left(\tfrac16\right)^2 \left(\tfrac16\right)^3 \left(\tfrac46\right)^5 \approx .043.$$
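The lumping device of Example 2 can be sketched in code: the unspecified outcomes are merged into one catch-all outcome that receives the leftover count and the leftover probability. This is an illustrative sketch only; the names `multinomial_probability` and `lumped_probability` are hypothetical and do not come from the text.

```python
from fractions import Fraction
from math import factorial


def multinomial_probability(counts, probs):
    """Multinomial probability for the given counts and outcome probabilities."""
    coefficient = factorial(sum(counts))
    for r in counts:
        coefficient //= factorial(r)
    probability = Fraction(coefficient)
    for r, p in zip(counts, probs):
        probability *= Fraction(p) ** r
    return probability


def lumped_probability(n, specified):
    """specified: list of (count, probability) pairs for the outcomes we care about.

    All remaining outcomes are merged into a single outcome whose count is
    n minus the specified counts and whose probability is 1 minus the
    specified probabilities.
    """
    counts = [r for r, _ in specified]
    probs = [Fraction(p) for _, p in specified]
    rest_count = n - sum(counts)
    rest_prob = 1 - sum(probs)
    return multinomial_probability(counts + [rest_count], probs + [rest_prob])


# Ten throws of a die: exactly two ones and exactly three threes.
die = lumped_probability(10, [(2, Fraction(1, 6)), (3, Fraction(1, 6))])
print(die, float(die))
```

The same helper covers any number of specified outcomes, since only the remainder is lumped.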



EXERCISES 

1. Suppose that in a city 60 per cent of the population are Democrats,
30 per cent are Republicans, and 10 per cent are Independents. What is the
probability that if three people are chosen at random there will be one
Republican, one Democrat, and one Independent voter? [Ans. .108.]

2. Three horses A, B, and C compete in four races. Assuming that each
horse has an equal chance in each race, what is the probability that A wins
two races and B and C win one each? What is the probability that the same
horse wins all four races? [Ans. 4/27; 1/27.]

3. Assume that in a certain large college 40 per cent of the students are 
freshmen, 30 per cent are sophomores, 20 per cent are juniors, and 10 per cent 
are seniors. A committee of eight is chosen at random from the student body. 
What is the probability that there are equal numbers from each class on the 
committee? 

4. Let us assume that when a batter comes to bat, he has probability .6 
of being put out, .1 of getting a walk, .2 of getting a single, .1 of getting an 
extra base hit. If he comes to bat five times in a game, what is the prob¬ 
ability that 

(a) He gets two walks and three singles? [Ans. .0008.]

(b) A walk, a single, an extra base hit (and is out twice)? [Ans. .043.]

(c) Has a perfect day (i.e., never out). [Ans. .010.]

5. Assume that a single torpedo has a probability 1/2 of sinking a ship,
probability 1/4 of damaging it, and probability 1/4 of missing. Assume further
that two damaging shots sink the ship. What is the probability that four
torpedoes will succeed in sinking the ship? [Ans. 251/256.]

6. Jones, Smith, and Green live in the same house. The mailman has
observed that Jones and Smith receive the same amount of mail on the
average, but that Green receives twice as much as Jones (and hence also
twice as much as Smith). If he has four letters for this house, what is the
probability that each man receives at least one letter?

7. If three dice are thrown, find the probability that there is one six and
two fives, given that all the outcomes are greater than three. [Ans. 1/9.]

8. A man plays a tournament consisting of three games. In each game 
he has probability \ for a win, i for a loss, and f for a draw, independently 
of the outcomes of other games. To win the tournament he must win more 
games than he loses. What is the probability that he wins the tournament? 

9. Assume that in a certain course the probability that a student chosen 
at random will get an A is .1, that he will get a B is .2, that he will get a C is 
.4, that he will get a D is .2, and that he will get an E is .1. What distribution 
of grades is most likely in the case of four students? 

[Ans. One B, two C’s, one D.]

10 . Let us assume that in a World Series game a batter has probability \ 
of getting no hits, J for getting one hit, \ for getting two hits, assuming that 
the probability of getting more than two hits is negligible. In a four-game 
World Series, find the probability that the batter gets: 

(a) Exactly two hits. 

(b) Exactly three hits. 

(c) Exactly four hits. 

(d) Exactly five hits. 

(e) Fewer than two hits or more than five. 

rj-MO 7 7 35 7 2 3 "I 

12. EXPECTED VALUE 

In this section we shall discuss the concept of expected value. 
Although it originated in the study of gambling games, it enters into 
almost any detailed probabilistic discussion. 

Definition. If in an experiment the possible outcomes are numbers,
$a_1, a_2, \ldots, a_k$, occurring with probability $p_1, p_2, \ldots, p_k$, then
the expected value is defined to be

$$E = a_1 p_1 + a_2 p_2 + \ldots + a_k p_k.$$

The term “expected value” is not to be interpreted as the value that 
will necessarily occur on a single experiment. For example, if a person 
bets $1 that a head will turn up when a coin is thrown, he expects to 
win $1 or to lose $1. His expected value is (1)(1/2) + (-1)(1/2) = 0,
which is not one of the possible outcomes. The term, expected value,
had its origin in the following consideration. If we repeat an experiment
with expected value E a large number of times, and if we expect
$a_1$ a fraction $p_1$ of the time, $a_2$ a fraction $p_2$ of the time, etc., then the
average that we expect per experiment is E. In particular, in a gambling
game E is interpreted as the average winning expected in a large
number of plays. Here the expected value is often taken as the value 
of the game to the player. If the game has a positive expected value, 
the game is said to be favorable, if the game has expected value zero 
it is said to be fair, and if it has negative expected value it is described 
as unfavorable. These terms are not to be taken too literally, since 
many people are quite happy to play games that, in terms of expected 
value, are unfavorable. 

Example 1. For the first example of the application of expected 
value we consider the game of roulette as played at Monte Carlo. 
There are several types of bets which the gambler can make, and we 
consider two of these. 

The wheel has the number 0 and the numbers from 1 to 36 marked 
on equally spaced slots. The wheel is spun and a ball comes to rest 
in one of these slots. If the player puts a stake, say of $1, on a given 
number, and the ball comes to rest in this slot, then he receives from 
the croupier 36 times his stake, or $36. Thus for a payment of $1 his
expected winning is 36/37 = .973. This can be interpreted to mean that
in the long run he can expect to lose about 2.7 per cent of his stakes. 

A second way to play is the following. A player may bet on “red” 
or “black.” The numbers from 1 to 36 are evenly divided between 
the two colors. If a player bets on “red,” and a red number turns up, 
he receives twice his stake. If a black number turns up, he loses his 
stake. If 0 turns up, then the wheel is spun until it stops on a number 
different from 0. If this is black, the player loses; but if it is red, he 
receives only his original stake, not twice it. For this type of play, 
the gambler pays $1 for an expected winning of 

2(18/37) + 1(1/74) = 73/74 = .9865.

In this case the player can expect to lose about 1.35 per cent of his 
stakes in the long run. Thus the expected loss in this case is only 
half as great as in the previous case. 
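The two roulette computations follow directly from the definition of expected value. The sketch below is illustrative only: the helper name `expected_value` is an assumption, and the outcome lists restate the rules described above (37 equally likely slots, and a bet on "red" that returns the stake when 0 is followed by red).

```python
from fractions import Fraction


def expected_value(outcomes):
    """outcomes: (value, probability) pairs whose probabilities sum to 1."""
    outcomes = list(outcomes)
    if sum(p for _, p in outcomes) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in outcomes)


# Stake of $1 on a single number: receive $36 with probability 1/37.
single_number = expected_value([(36, Fraction(1, 37)), (0, Fraction(36, 37))])

# Stake of $1 on red: $2 on red, $1 back if 0 is followed by red, else nothing.
zero_then_red = Fraction(1, 37) * Fraction(1, 2)
red = expected_value([
    (2, Fraction(18, 37)),
    (1, zero_then_red),
    (0, Fraction(18, 37) + zero_then_red),
])

print(single_number, float(single_number))
print(red, float(red))
```

Subtracting each result from the $1 paid gives the long-run loss per dollar staked for that bet.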

Example 2. A player rolls a die and receives a number of dollars 
corresponding to the number of dots on the face which turns up. 
What should the player pay for playing, to make this a fair game?
To answer this question, we note that the player wins 1, 2, 3, 4, 5 or 6
dollars, each with probability 1/6. Hence, his expected winning is

1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3½.

Thus if he pays $3.50, his expected winnings will be zero. 


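Example 3. What is the expected number of successes in the case
of four independent trials with probability 1/3 for success? We know
that the probability of x successes is
$\binom{4}{x} \left(\tfrac13\right)^x \left(\tfrac23\right)^{4-x}$. Thus

$$
\begin{aligned}
E &= 0 \cdot \binom{4}{0}\left(\tfrac13\right)^0\left(\tfrac23\right)^4
   + 1 \cdot \binom{4}{1}\left(\tfrac13\right)^1\left(\tfrac23\right)^3
   + 2 \cdot \binom{4}{2}\left(\tfrac13\right)^2\left(\tfrac23\right)^2 \\
  &\quad + 3 \cdot \binom{4}{3}\left(\tfrac13\right)^3\left(\tfrac23\right)^1
   + 4 \cdot \binom{4}{4}\left(\tfrac13\right)^4\left(\tfrac23\right)^0 \\
  &= 0 + \tfrac{32}{81} + \tfrac{48}{81} + \tfrac{24}{81} + \tfrac{4}{81} = \tfrac43.
\end{aligned}
$$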

In general, it can be shown that in n trials with probability p for 
success, the expected number of successes is np. 
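The general statement can be spot-checked by summing x times the probability of x successes directly from the definition. This is a sketch under stated assumptions: the function name `expected_successes` is hypothetical, and n = 4 with p = 1/3 serves only as a sample case. A numerical check of a few cases is not a proof.

```python
from fractions import Fraction
from math import comb


def expected_successes(n, p):
    """Expected number of successes in n independent trials, from the definition."""
    p = Fraction(p)
    return sum(x * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1))


n, p = 4, Fraction(1, 3)
print(expected_successes(n, p), n * p)
```

Both printed values agree, as they do for any other n and p tried in the same way.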


Example 4. In the game of craps a pair of dice is rolled by one
of the players. If the sum of the spots shown is 7 or 11, he wins.
If it is 2, 3, or 12, he loses. If it is another sum, he must continue 
rolling the dice until he either repeats the same sum or rolls a 7. In 
the former case he wins, in the latter he loses. Let us suppose that 
he wins or loses $1. Then the two possible outcomes are +1 and — 1. 
We will compute the expected value of the game. First we must find 
the probability that he will win. 

We represent the possibilities by a two-stage tree shown in Figure 
19. While it is theoretically possible for the game to go on indefinitely, 
we do not consider this possibility. This means that our analysis 
applies only to games which actually stop at some time. 

The branch probabilities at the first stage are determined by think¬ 
ing of the 36 possibilities for the throw of the two dice as being equally 
likely and taking in each case the fraction of the possibilities which 
correspond to the branch as the branch probability. The probabilities 
for the branches at the second level are obtained as follows. If, for 
example, the first outcome was a 4, then when the game ends, a 4 or 
7 must have occurred. The possible outcomes for the dice were 
{(3,1), (1,3), (2,2), (4,3), (3,4), (2,5), (5,2), (1,6), (6,1)}. Again we
consider these possibilities to be equally likely and assign to the
branch considered the fraction of the outcomes which correspond to
this branch. Thus to the 4 branch we assign a probability 3/9 = 1/3.
The other branch probabilities are determined in a similar way. 



Figure 19 (second-stage branches: after a first sum of 4, 5, 6, 8, 9,
or 10, a repeat of that sum is a win, W, and a 7 is a loss, L)


Having the tree measure assigned, to find the probability of a win
we must simply add the weights of all paths leading to a win. If
this is done, we obtain 244/495. Thus the player’s expected value is
1 · (244/495) + (-1) · (251/495) = -7/495 = -.0141. Hence he can expect to
lose 1.41 per cent of his stakes in the long run. It is interesting to
note that this is just slightly less favorable than his losses in betting
on “red” in roulette.
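The weights of the winning paths in the two-stage tree can be added up by machine. The sketch below is an illustration, not the text's own procedure; the function name `craps_win_probability` is an assumption. It counts the 36 equally likely throws for the first stage and, for the second stage, uses the fraction of deciding throws that repeat the first sum.

```python
from fractions import Fraction
from itertools import product


def craps_win_probability():
    """Sum the tree-measure weights of all paths that end in a win."""
    # ways[s] = number of the 36 equally likely throws that give the sum s.
    ways = {}
    for first_die, second_die in product(range(1, 7), repeat=2):
        total = first_die + second_die
        ways[total] = ways.get(total, 0) + 1

    win = Fraction(0)
    for total, count in ways.items():
        first_stage = Fraction(count, 36)
        if total in (7, 11):
            win += first_stage  # immediate win
        elif total in (2, 3, 12):
            continue  # immediate loss
        else:
            # The game ends on this sum or on a 7; only those throws matter.
            win += first_stage * Fraction(count, count + ways[7])
    return win


win = craps_win_probability()
value = 1 * win + (-1) * (1 - win)
print(win, value, float(value))
```

As in the text, games that never stop are ignored, so the second-stage probabilities are conditional on the game ending.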


EXERCISES 

1. Suppose that A tosses 2 coins and receives $2 if two heads appear, $1
if one head appears, and nothing if no heads appear. What is the expected
value of the game to him? [Ans. $1.]

2. Smith and Jones are matching coins. If the coins match, Smith gets $1, 
and if they do not, Jones gets $1. 

(a) If the game consists of matching twice, what is the expected value 
of the game for Smith?

(b) Suppose that if Smith wins the first round he quits, and if he loses
the first he plays the second. Jones is not allowed to quit. What 
is the expected value of the game for Smith? 

3. If five coins are thrown, what is the expected number of heads that
will turn up? [Ans. 5/2.]

4 . A coin is thrown until the first time a head comes up or until three 
tails in a row occur. Find the expected number of times the coin is thrown. 

5. A man wishes to purchase a five cent newspaper. He has in his pocket 
one dime and five pennies. The newsman offers to let him have the paper in 
exchange for one coin drawn at random from the customer’s pocket. 

(a) Is this a fair proposition and, if not, to whom is it favorable? 

[Ans. Favorable to man.]

(b) Answer the same questions as in (a) assuming that the newsman 
demands two coins drawn at random from the customer’s pocket. 

[Ans. Fair proposition.] 

6 . A bets 50 cents against B’s x cents that, if two cards are dealt from a 
shuffled pack of ordinary playing cards, both cards will be of the same color. 
What value of x will make this bet fair? 

7 . Prove that if the expected value of a given experiment is E, and if a 
constant c is added to each of the outcomes, the expected value of the new 
experiment is E + c. 

8. Prove that, if the expected value of a given experiment is E, and if 
each of the possible outcomes is multiplied by a constant k, the expected 
value of the new experiment is k · E.

9 . Referring to Example 2, Section 7, find the expected number of blocks 

the man will walk before reaching home. [Ans. f.] 

10. An urn contains two black and three white balls. Balls are successively 
drawn from the urn without replacement until a black ball is obtained. Find 
the expected number of draws required. 

11. Using the result of Exercises 15, 16 of Section 7, find the expected
number of games in the World Series (a) under the assumption that each
team has probability 1/2 of winning each game and (b) under the assumption
that the stronger team has probability 2/3 of winning each game.

[Ans. 5.81; 5.50.] 

12 . Suppose that we modify the game of craps as follows: On a 7 or 11 
the player wins $2, on a 2, 3, or 12 he loses $3; otherwise the game is as 
usual. Find the expected value of the new game, and compare it with the old 
value.

13. Suppose that in roulette at Monte Carlo we place 50 cents on “red”
and 50 cents on “black.” What is the expected value on the game? Is this 
better or worse than placing $1 on “red”? 

14. Betting on “red” in roulette can be described roughly as follows. We
win with probability .49, get our money back with probability .01, and lose
with probability .50. Draw the tree for three plays of the game, and compute
(to three decimals) the probability of each path. What is the probability
that we are ahead at the end of three bets? [Ans. .485.]

15. Assume that the odds are r:s that a certain statement will be true. 
If a man receives s dollars if the statement turns out to be true, and gives 
r dollars if not, what is his expected winning? 

16. Referring to Exercise 9 of Section 3, find the expected number of 
languages that a student chosen at random reads. 

17. Referring to Exercise 5 of Section 4, find the expected number of men 

who get their own hats. [Ans. 1.] 

13. MARKOV CHAINS

In this section we shall study a more general kind of process than 
the ones considered in the last three sections. 

We assume that we have a sequence of experiments with the following
properties. The outcome of each experiment is one of a finite
number of possible outcomes $a_1, a_2, \ldots, a_r$. It is assumed that the
probability of outcome $a_j$ on any given experiment is not necessarily
independent of the outcomes of previous experiments but depends at
most upon the outcome of the immediately preceding experiment.
We assume that there are given numbers $p_{ij}$ which represent the
probability of outcome $a_j$ on any given experiment, given that outcome
$a_i$ occurred on the preceding experiment. The outcomes
$a_1, a_2, \ldots, a_r$ are called states, and the numbers $p_{ij}$ are called
transition probabilities. If we assume that the process begins in some
particular state, then we have enough information to determine the tree
measure for the process and can calculate probabilities of statements
relating to the over-all sequence of experiments. A process of the above
kind is called a Markov chain process.

The transition probabilities can be exhibited in two different ways.
The first way is that of a square array. For a Markov chain with
states $a_1$, $a_2$, and $a_3$, this array is written as

$$
\begin{pmatrix}
p_{11} & p_{12} & p_{13} \\
p_{21} & p_{22} & p_{23} \\
p_{31} & p_{32} & p_{33}
\end{pmatrix}.
$$

Such an array is a special case of a matrix. Matrices are of fundamental
importance to the study of Markov chains as well as being important in the
study of other branches of mathematics. They will be studied in detail in
the next chapter.

A second way to show the transition probabilities is by a transition
diagram. Such a diagram is illustrated for a special case in Figure 20.
The arrows from each state indicate the possible states to which a
process can move from the given state.

Figure 20

The matrix of transition probabilities which corresponds to this
diagram is the matrix

$$
P = \begin{array}{c|ccc}
      & a_1 & a_2 & a_3 \\ \hline
  a_1 & 0 & 1 & 0 \\
  a_2 & 0 & \tfrac12 & \tfrac12 \\
  a_3 & \tfrac13 & 0 & \tfrac23
\end{array}
$$

An entry of 0 indicates that the transition is impossible. 

Notice that in the matrix P the sum of the elements of each row
is 1. This must be true in any matrix of transition probabilities, since
the elements of the ith row represent the probabilities for all possibilities
when the process is in state $a_i$.

The kind of problem in which we are most interested in the study
of Markov chains is the following. Suppose that the process starts
in state i. What is the probability that after n steps it will be in
state j? We denote this probability by $p_{ij}^{(n)}$. Notice that we do not
mean by this the nth power of the number $p_{ij}$. We are actually
interested in this probability for all possible starting positions i
and all possible terminal positions j. We can represent these numbers
conveniently again by a matrix. For example for n steps in a
three-state Markov chain we write these probabilities as the matrix

$$
P^{(n)} = \begin{pmatrix}
p_{11}^{(n)} & p_{12}^{(n)} & p_{13}^{(n)} \\
p_{21}^{(n)} & p_{22}^{(n)} & p_{23}^{(n)} \\
p_{31}^{(n)} & p_{32}^{(n)} & p_{33}^{(n)}
\end{pmatrix}.
$$


Example 1. Let us find for a Markov chain with transition probabilities
indicated in Figure 20 the probability of being at the various
possible states after three steps, assuming that the process starts
at state $a_1$. We find these probabilities by constructing a tree
and a tree measure as in Figure 21.

The probability $p_{13}^{(3)}$, for example, is the sum of the weights
assigned by the tree measure to all paths through our tree which
end at state $a_3$. That is,
$p_{13}^{(3)} = 1 \cdot \tfrac12 \cdot \tfrac12 + 1 \cdot \tfrac12 \cdot \tfrac23 = \tfrac{7}{12}$.
Similarly $p_{12}^{(3)} = 1 \cdot \tfrac12 \cdot \tfrac12 = \tfrac14$ and
$p_{11}^{(3)} = 1 \cdot \tfrac12 \cdot \tfrac13 = \tfrac16$. By constructing a similar tree
measure, assuming that we start at state $a_2$, we could find $p_{21}^{(3)}$,
$p_{22}^{(3)}$, and $p_{23}^{(3)}$. The same is true for $p_{31}^{(3)}$, $p_{32}^{(3)}$, and
$p_{33}^{(3)}$. If this is carried out (see Exercise 7) we can write the results
in matrix form as follows:

$$
P^{(3)} = \begin{array}{c|ccc}
      & a_1 & a_2 & a_3 \\ \hline
  a_1 & \tfrac16 & \tfrac14 & \tfrac{7}{12} \\
  a_2 & \tfrac{7}{36} & \tfrac{7}{24} & \tfrac{37}{72} \\
  a_3 & \tfrac{4}{27} & \tfrac{7}{18} & \tfrac{25}{54}
\end{array}
$$

Again the rows add up to 1, corresponding to the fact that if we start
at a given state we must reach some state after three steps. Notice
now that all the elements of this matrix are positive, showing that
it is possible to reach any state from any state in three steps. In the
next chapter we will develop a simple method of computing $P^{(n)}$.
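The tree computation can be imitated by carrying the weights forward one level at a time. In the sketch below, which is illustrative rather than the text's own method, the matrix for Figure 20 is read as having rows (0, 1, 0), (0, 1/2, 1/2), and (1/3, 0, 2/3); that reading, together with the name `n_step_distribution`, is an assumption.

```python
from fractions import Fraction as F

# Transition matrix as read from Figure 20 (rows and columns ordered a1, a2, a3).
P = [
    [F(0), F(1), F(0)],
    [F(0), F(1, 2), F(1, 2)],
    [F(1, 3), F(0), F(2, 3)],
]


def n_step_distribution(P, start, n):
    """Probability of each state after n steps, starting in state index `start`."""
    size = len(P)
    dist = [F(0)] * size
    dist[start] = F(1)
    for _ in range(n):
        # One more level of the tree: extend every path by one branch and
        # add the weights of the paths that arrive at the same state.
        dist = [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]
    return dist


P3 = [n_step_distribution(P, start, 3) for start in range(3)]
for row in P3:
    print([str(entry) for entry in row])
```

Each printed row is one row of the three-step matrix, and each row sums to 1.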

Figure 21

Example 2. Suppose that we are interested in studying the way
in which a given state votes in a series of national elections. We wish
to make long-term predictions and so will not consider conditions
peculiar to a particular election year. We shall base our predictions
only on past history of the outcomes of the elections, Republican or
Democratic. It is clear that a knowledge of these past results would
influence our predictions for the future. As a first approximation, we
assume that the knowledge of the past beyond the last election would 
not cause us to change the probabilities for the outcomes on the next 
election. With this assumption we obtain a Markov chain with two 
states R and D and matrix of transition probabilities 

$$
\begin{array}{c|cc}
    & R & D \\ \hline
  R & 1 - a & a \\
  D & b & 1 - b
\end{array}
$$

The numbers a and b could be estimated from past results as follows. 
We could take for a the fraction of the previous years in which the 
outcome has changed from Republican in one year to Democratic in 
the next year, and for b the fraction of reverse changes. 
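One way to carry out this estimate is sketched below. It rests on an interpretation that should be stated plainly: a is taken as the fraction of Republican years that were followed by a Democratic year, and b as the fraction of Democratic years followed by a Republican year. The function name `estimate_a_b` and the history string are made up for illustration and are not real election data.

```python
from fractions import Fraction


def estimate_a_b(history):
    """Estimate (a, b) from a string of past winners such as "RRDRDDDR".

    Raises ZeroDivisionError if one party never appears before the last year.
    """
    # For each party: [years followed by another election, of which changed party].
    counts = {"R": [0, 0], "D": [0, 0]}
    for previous, following in zip(history, history[1:]):
        counts[previous][0] += 1
        if following != previous:
            counts[previous][1] += 1
    a = Fraction(counts["R"][1], counts["R"][0])
    b = Fraction(counts["D"][1], counts["D"][0])
    return a, b


# Hypothetical record of eight elections.
a, b = estimate_a_b("RRDRDDDR")
print(a, b)
```

With a short record such as this one the estimates are crude; a longer history would be needed before they could be trusted.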

We can obtain a better approximation by taking into account the 
previous two elections. In this case our states are RR, RD, DR, and
DD, indicating the outcome of two successive elections. Being in
state RR means that the last two elections were Republican victories.
If the next election is a Democratic victory, we will be in state RD.
If the election outcomes for a series of years is DDDRDRR, then our
process has moved from state DD to DD to DR to RD to DR, and
finally to RR. Notice that the first letter of the state to which we
move must agree with the second letter of the state from which we 
came, since these refer to the same election year. Our matrix of 
transition probabilities will then have the form, 

$$
\begin{array}{c|cccc}
     & RR & DR & RD & DD \\ \hline
  RR & 1 - a & 0 & a & 0 \\
  DR & b & 0 & 1 - b & 0 \\
  RD & 0 & 1 - c & 0 & c \\
  DD & 0 & d & 0 & 1 - d
\end{array}
$$

Again the numbers a, b, c, and d would have to be estimated. The
study of this example is continued in Chapter V, Section 8.
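The passage from single outcomes to two-election states, and the shape of the resulting matrix, can be sketched as follows. The names `pair_states`, `ORDER`, and `two_election_matrix` are hypothetical, and the numerical values given to a, b, c, and d are placeholders rather than estimates.

```python
from fractions import Fraction

ORDER = ["RR", "DR", "RD", "DD"]  # row and column order used in the matrix


def pair_states(outcomes):
    """Turn a string of election outcomes into the sequence of two-election states."""
    return [outcomes[i:i + 2] for i in range(len(outcomes) - 1)]


def two_election_matrix(a, b, c, d):
    """Transition matrix with rows and columns in the order RR, DR, RD, DD."""
    return [
        [1 - a, 0, a, 0],
        [b, 0, 1 - b, 0],
        [0, 1 - c, 0, c],
        [0, d, 0, 1 - d],
    ]


states = pair_states("DDDRDRR")
print(states)

# Placeholder values only; real values would be estimated from past results.
M = two_election_matrix(Fraction(1, 4), Fraction(1, 3), Fraction(1, 2), Fraction(1, 5))
for name, row in zip(ORDER, M):
    print(name, [str(entry) for entry in row])
```

Every nonzero entry sits where the second letter of the row state equals the first letter of the column state, which is the agreement condition noted in the text.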

Example 3. The following example of a Markov chain has been 
used in physics as a simple model for diffusion of gases. We shall see 
later that a similar model applies to an idealized problem in changing 
populations. 

We imagine n black balls and n white balls which are put into two
urns so that there are n balls in each urn. A single experiment consists
in choosing a ball from each urn at random and putting the ball
obtained from the first urn into the second urn, and the ball obtained
from the second urn into the first. We take as state the number of
black balls in the first urn. If at any time we know this number, then
we know the exact composition of each urn. That is, if there are j
black balls in urn 1, there must be n - j black balls in urn 2, n - j
white balls in urn 1, and j white balls in urn 2. If the process is in
state j, then after the next exchange it will be in state j - 1, if a black
ball is chosen from urn 1 and a white ball from urn 2. It will be in
state j if a ball of the same color is drawn from each urn. It will be in
state j + 1 if a white ball is drawn from urn 1 and a black ball from
urn 2. The transition probabilities are then given by (see Exercise 12):


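$$
\begin{aligned}
p_{j,j-1} &= \left(\frac{j}{n}\right)^2, & j > 0 \\
p_{jj} &= \frac{2j(n-j)}{n^2} \\
p_{j,j+1} &= \left(\frac{n-j}{n}\right)^2, & j < n \\
p_{jk} &= 0 & \text{otherwise.}
\end{aligned}
$$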
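These transition probabilities are easy to tabulate for a small number of balls. The sketch below is illustrative; the function name `urn_transition_matrix` is an assumption, and n = 3 is chosen only as a sample size. State j, the number of black balls in urn 1, is used directly as the row index.

```python
from fractions import Fraction


def urn_transition_matrix(n):
    """Transition matrix for the exchange model with n black and n white balls.

    Entry [j][k] is the probability of moving from j to k black balls in urn 1.
    """
    size = n + 1
    P = [[Fraction(0)] * size for _ in range(size)]
    for j in range(size):
        if j > 0:
            # Black drawn from urn 1 and white drawn from urn 2.
            P[j][j - 1] = Fraction(j, n) ** 2
        # Same color drawn from both urns.
        P[j][j] = Fraction(2 * j * (n - j), n * n)
        if j < n:
            # White drawn from urn 1 and black drawn from urn 2.
            P[j][j + 1] = Fraction(n - j, n) ** 2
    return P


P = urn_transition_matrix(3)
for row in P:
    print([str(entry) for entry in row])
```

Checking that every row sums to 1 is a quick test that the three cases cover all possible exchanges.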


A physicist would be interested, for example, in predicting the com¬ 
position of the urns after a certain number of exchanges have taken 
place. Certainly any predictions about the early stages of the process 
would depend upon the initial composition of the urns. For example, 
if we started with all black balls in urn 1, we would expect that for
some time there would be more black balls in urn 1 than in urn 2.
On the other hand, it might be expected that the effect of this initial
distribution would wear off after a large number of exchanges. We
shall see later, in Chapter V, Section 8, that this is indeed the case.

EXERCISES 

1. Draw a state diagram for the Markov chain with transition
probabilities given by the following matrices.

2. Give the matrix of transition probabilities corresponding to the
following transition diagrams.



3. Find the matrix $P^{(2)}$ for the Markov chain determined by the matrix
of transition probabilities 



4. What is the matrix of transition probabilities for the Markov chain 
in Example 3, for the case of 2 white balls and 2 black balls? 

5. Find the matrices $P^{(2)}$, $P^{(3)}$, $P^{(4)}$ for the Markov chain determined by
the transition probabilities 

(i!> 

Find the same for the Markov chain determined by the matrix 

c ;> 

6 . Suppose that a Markov chain has two states, a x and a 2 and transition 
probabilities given by the matrix 


By means of a separate chance device we choose a state in which to start 
the process. This device chooses a x with probability i and a 2 with probability 
Find the probability that the process is in state a x after the first step. 
Answer the same question in the case that the device chooses a x with prob¬ 
ability | and a 2 with probability f. [Arts. A; £.] 

7. Referring to the Markov chain with transition probabilities indicated
in Figure 20, construct the tree measures and determine the values of

$p_{21}^{(3)}$, $p_{22}^{(3)}$, $p_{23}^{(3)}$, and $p_{31}^{(3)}$, $p_{32}^{(3)}$, $p_{33}^{(3)}$.

8. A certain calculating machine uses only the digits 0 and 1. It is
supposed to transmit one of these digits through several stages. However, at
every stage there is a probability p that the digit which enters this stage 
will be changed when it leaves. We form a Markov chain to represent the 
process of transmission by taking as states the digits 0 and 1. What is the 
matrix of transition probabilities? 

9. For the Markov chain in Exercise 8, draw a tree and assign a tree
measure, assuming that the process begins in state 0 and moves through three
stages of transmission. What is the probability that the machine after three
stages produces the digit 0, i.e., the correct digit? What is the probability
that the machine never changed the digit from 0?

10. Assume that a man’s profession can be classified as professional, 

skilled laborer, or unskilled laborer. Assume that of the sons of professional 
men 80 per cent are professional, 10 per cent are skilled laborers, and 10 
per cent are unskilled laborers. In the case of sons of skilled laborers, 60 
per cent are skilled laborers, 20 per cent are professional, and 20 per cent are 
unskilled laborers. Finally, in the case of unskilled laborers, 50 per cent of 
the sons are unskilled laborers, and 25 per cent each are in the other two 
categories. Assume that every man has a son, and form a Markov chain by 
following a given family through several generations. Set up the matrix of 
transition probabilities. Find the probability that the grandson of an un¬ 
skilled laborer is a professional man. [Ans. .375.] 

11. In Exercise 10 we assumed that every man has a son. Assume instead
that the probability a man has a son is .8. Form a Markov chain with four
states. The first three states are as in Exercise 10, and the fourth state
is such that the process enters it if a man has no son, and that the state 
cannot be left. This state represents families whose male line has died out. 
Find the matrix of transition probabilities and find the probability that an 
unskilled laborer has a grandson who is a professional man. [Ans. .24.] 

12. Explain why the transition probabilities given in Example 3 are correct.


VECTORS AND MATRICES


1. COLUMN AND ROW VECTORS 

A column vector is an ordered collection of numbers written in a 
column. Examples of such vectors are 



The individual numbers in these vectors are called components, and
the number of components a vector has is one of its distinguishing
characteristics. Thus the first two vectors above have two components;
the next two have three components; and the last has four
components. When talking more generally about n-component column
vectors we shall write

$$u = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}.$$



Analogously, a row vector is an ordered collection of numbers written 
in a row. Examples of row vectors are 

(1,0), (-2,1), (2,-3,4,0), (-1,2,-3,4,-5).

Each number appearing in the vector is again called a component of
the vector, and the number of components a row vector has is again
one of its important characteristics. Thus, the first two examples are
two-component, the third a four-component, and the fourth a
five-component vector. The vector $v = (v_1, v_2, \ldots, v_n)$ is an n-component
row vector.

Two row vectors, or two column vectors, are said to be equal if
and only if corresponding components of the vectors are equal. Thus
for the vectors

$$u = (1,2), \quad v = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad w = (1,2), \quad x = (2,1),$$

we see that $u = w$ but $u \neq v$, and $u \neq x$.

If u and v are three-component column vectors, we shall define
their sum u + v by component-wise addition as follows:

$$
u + v = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}
      + \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}
      = \begin{pmatrix} u_1 + v_1 \\ u_2 + v_2 \\ u_3 + v_3 \end{pmatrix}.
$$

Similarly, if u and v are three-component row vectors, their sum is
defined to be

$$u + v = (u_1, u_2, u_3) + (v_1, v_2, v_3) = (u_1 + v_1,\; u_2 + v_2,\; u_3 + v_3).$$

Note that the sum of two three-component vectors yields another 
three-component vector. For example, 



and 

(4,-7,12) + (3,14,-14) = (7,7,-2). 

The sum of two n-component vectors (either row or column) is 
defined by component-wise addition in an analogous manner, and 
yields another n-component vector. Observe that we do not define 
the addition of vectors unless they are both row or both column 
vectors, having the same number of components. 

Because the order in which two numbers are added is immaterial
as far as the answer goes, it is also true that the order in which
vectors are added does not matter; that is, 

u + v = v + u

where u and v are both row or both column vectors. This is the
so-called commutative law of addition. A numerical example is



Once we have the definition of the addition of two vectors we can 
easily see how to add three or more vectors by grouping them in pairs 
as in the addition of numbers. For example, 



and 

(1,0,0) + (0,2,0) + (0,0,3) = (1,2,0) + (0,0,3) = (1,2,3) 

= (1,0,0) + (0,2,3) = (1,2,3). 

In general, the sum of any number of vectors (row or column), each 
having the same number of components, is the vector whose first 
component is the sum of the first components of the vectors, whose 
second component is the sum of the second components, etc. 

The multiplication of a number a times a vector v is defined by
component-wise multiplication of a times the components of v. For
the three-component case we have

$$av = a \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} av_1 \\ av_2 \\ av_3 \end{pmatrix}$$

for column vectors and

$$av = a(v_1, v_2, v_3) = (av_1, av_2, av_3)$$

for row vectors. If u is an n-component vector (row or column), then
au is defined similarly by component-wise multiplication.

If u is any vector we define its negative $-u$ to be the vector
$-u = (-1)u$. Thus in the three-component case for row vectors we
have

$$-u = (-1)(u_1, u_2, u_3) = (-u_1, -u_2, -u_3).$$

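Once we have the negative of a vector it is easy to see how to subtract
vectors, i.e., we simply add “algebraically.” For the three-component
column vector case we have

$$
u - v = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}
      - \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}
      = \begin{pmatrix} u_1 - v_1 \\ u_2 - v_2 \\ u_3 - v_3 \end{pmatrix}.
$$

Specific examples of subtraction of vectors occur in the exercises at
the end of this section.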

An important vector is the zero vector all of whose components are 
zero. For example, three-component zero vectors are 

\[ 0 = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \quad \text{and} \quad 0 = (0,0,0). \]

When there is no danger of confusion we shall use the symbol 0, as 
above, to denote the zero (row or column) vector. The meaning will 
be clear from the context. The zero vector has the important property 
that, if u is any vector, then u + 0 = u. A proof for the three- 
component column vector case is as follows: 

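\[ u + 0 = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} u_1 + 0 \\ u_2 + 0 \\ u_3 + 0 \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = u. \]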

One of the chief advantages of the vector notation is that one can 
denote a whole collection of numbers by a single letter such as u,
v, . . . , and treat such a collection as if it were a single quantity.
By using the vector notation it is possible to state very complicated 
relationships in a simple manner. The student will see many examples 
of this in the remainder of the present chapter and the two succeeding 
chapters. 
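The component-wise rules of this section can be tried out in a few lines of Python. The following is only a sketch: the function names are our own, vectors are stored as plain lists, and the distinction between row and column vectors is ignored, so only the number of components is checked.

```python
def add(u, v):
    """Component-wise sum; defined only for vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [a + b for a, b in zip(u, v)]

def scale(a, v):
    """Multiply every component of v by the number a."""
    return [a * x for x in v]

def negative(v):
    """The negative of v, defined as (-1)v."""
    return scale(-1, v)

def subtract(u, v):
    """u - v, computed as u + (-v)."""
    return add(u, negative(v))

u, v, w = [1, 0, 0], [0, 2, 0], [0, 0, 3]
zero = [0, 0, 0]

total = add(add(u, v), w)      # grouping (u + v) + w
regrouped = add(u, add(v, w))  # grouping u + (v + w)
print(total, regrouped)        # both are [1, 2, 3]
```

The two groupings reproduce the sum (1,0,0) + (0,2,0) + (0,0,3) = (1,2,3) worked out in the text.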
EXERCISES

1. Compute the quantities below for the vectors



(a) 2 u. 

(b) -v.

(c) 2 u — v. 

(d) v + w. 

(e) u + v — w. 

(f) 2 u — 3v — w. 

(g) 3u — v + 2 w. 


[Ans. 



.] 




2. Compute (a) through (g) of Exercise 1 if the vectors u, v, and w are

u = (7,0,-3), v = (2,1,-5), w = (1,-1,0).

3. (a) Show that the zero vector is not changed when multiplied by any
number.

(b) If u is any vector, show that 0 + u = u.

4. If u and v are two row or two column vectors having the same number
of components, prove that u + 0v = u and 0u + v = v.

5. If 2u - v = 0, what is the relationship between the components of u
and those of v? [Ans. \(v_i = 2u_i\).]

6. Answer the question in Exercise 5 for the equation -3u + 5v + u
- 7v = 0. Do the same for the equation 20v - 3u + 5v + 8u = 0.

7. When possible compute the following sums; when not possible give 
reasons. 



(b) (2,-1,-1) + 0(4,7,-2) = ?
(c) (5,6) + 7 - 21 +




+ 3 


= ? 


8 . If 



+ 




find \(u_1\), \(u_2\), and \(u_3\).


[Ans. 0; -2; -2.]


M M 

9. If 2 I 1 — I 1 I, find the components of 

w w 

0 / aA / ()\ 

+ I ^2 j = I 0 ], what can be said concerning the components 

\uj \ 0 / 

\(u_1\), \(u_2\), \(u_3\)?


11. If \(0\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}\), what can be said concerning the components \(u_1\),
\(u_2\), \(u_3\)?


12 . Suppose that we associate with each person a three-component row 
vector having the following entries: age, height, and weight. Would it make 
sense to add together the vectors associated with two different persons? 
Would it make sense to multiply one of these vectors by a constant? 


13. Suppose that we associate with each person leaving a supermarket a 
row vector whose components give the quantities of each available item that 
he has purchased. Answer the same questions as those in Exercise 12. 


14. Let us associate with each supermarket a column vector whose entries 
give the prices of each item in the store. Would it make sense to add together 
the vectors associated with two different supermarkets? Would it make sense 
to multiply one of these vectors by a constant? Discuss the differences in 
the situations given in Exercises 12, 13, and 14. 


2. THE PRODUCT OF VECTORS; EXAMPLES 

The reader may have wondered why it was necessary to introduce 
both column and row vectors when their properties are so similar. 
This question can be answered in several different ways. In the first 
place, in many applications there are two kinds of quantities which 
are studied simultaneously, and it is convenient to represent one of 
them as a row vector and the other as a column vector. Second, there
is a way of combining row and column vectors that is very useful for 
certain types of calculations. To bring out these points let us look 
at the following simple economic example. 

Example 1. Suppose a man named Smith goes into a grocery store 
to buy a dozen each of eggs and oranges, a half dozen each of apples 
and pears, and three lemons. Let us represent his purchases by means 
of the following row vector: 

x = [6 (apples), 12 (eggs), 3 (lemons), 12 (oranges), 6 (pears)]

= (6,12,3,12,6). 

Suppose that apples are 4 cents each, eggs are 6 cents each, lemons 
are 9 cents each, oranges are 5 cents each, and pears are 7 cents each. 
We can then represent the prices of these items as a column vector 


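\[ y = \begin{pmatrix} 4 \\ 6 \\ 9 \\ 5 \\ 7 \end{pmatrix} \quad \begin{matrix} \text{cents per apple} \\ \text{cents per egg} \\ \text{cents per lemon} \\ \text{cents per orange} \\ \text{cents per pear.} \end{matrix} \]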


The obvious question to ask now is, what is the total amount that 
Smith must pay for his purchases? What we would like to do is to 
multiply the quantity vector x by the price vector y, and we would 
like the result to be Smith’s bill. We see that our multiplication 
should have the following form: 


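\[ x \cdot y = (6,12,3,12,6)\begin{pmatrix} 4 \\ 6 \\ 9 \\ 5 \\ 7 \end{pmatrix} = 6\cdot 4 + 12\cdot 6 + 3\cdot 9 + 12\cdot 5 + 6\cdot 7 \]

\[ = 24 + 72 + 27 + 60 + 42 = 225 \text{ cents or } \$2.25. \]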


This is, of course, the computation that the cashier performs in figur¬ 
ing Smith’s bill. 
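The cashier's computation can be imitated in Python. This is a minimal sketch in which the function name `dot` is our own choice and both vectors are stored as plain lists, with the quantity vector playing the role of the row and the price vector the role of the column.

```python
def dot(row, column):
    """Row-times-column product: multiply matching components and add."""
    if len(row) != len(column):
        raise ValueError("vectors must have the same number of components")
    return sum(r * c for r, c in zip(row, column))

x = [6, 12, 3, 12, 6]   # apples, eggs, lemons, oranges, pears
y = [4, 6, 9, 5, 7]     # price of each item in cents

bill_cents = dot(x, y)
print(bill_cents)        # 225, that is, $2.25
```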
We shall adopt in general the above definition of multiplication of
row times column vectors.

Definition. Let u be a row vector and v a column vector each
having the same number n of components; then we shall define the
product u·v to be

\[ u \cdot v = u_1v_1 + u_2v_2 + \cdots + u_nv_n. \]

Notice that we always write the row vector first and the column 
vector second, and this is the only kind of vector multiplication that 
we consider. Some examples of vector multiplication are given below: 

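\[ (2,1,-1)\begin{pmatrix} 3 \\ -1 \\ 4 \end{pmatrix} = 2\cdot 3 + 1\cdot(-1) + (-1)\cdot 4 = 1. \]

\[ (1,0)\begin{pmatrix} 0 \\ 1 \end{pmatrix} = 1\cdot 0 + 0\cdot 1 = 0 + 0 = 0. \]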

Note that the result of vector multiplication is always a number. 

Example 2. Consider an oversimplified economy which has three 
industries which we call coal, electricity, and steel, and three con¬ 
sumers 1, 2, and 3. Suppose that each consumer uses some of the 
output of each industry and also that each industry uses some of the 
output of each other industry. We assume that the amounts used are 
positive or zero, since using a negative quantity has no immediate 
interpretation. We can represent the needs of each consumer and 
industry by a three-component demand (row) vector, the first com¬ 
ponent measuring the amount of coal needed by the consumer or 
industry; the second component the amount of electricity needed; 
and the third component the amount of steel needed, in some con¬ 
venient units. For example, the demand vectors of the three con¬ 
sumers might be 

\(d_1 = (3,2,5)\), \(d_2 = (0,17,1)\), \(d_3 = (4,6,12)\);

and the demand vectors of each of the industries might be

\(d_C = (0,1,4)\), \(d_E = (20,0,8)\), \(d_S = (30,5,0)\),

where the subscript C stands for coal; the subscript E, for electricity;
and the subscript S, for steel. Then the total demand for these goods
by the consumers is given by the sum

\(d_1 + d_2 + d_3 = (3,2,5) + (0,17,1) + (4,6,12) = (7,25,18).\)

Also, the total industrial demand for these goods is given by the sum

\(d_C + d_E + d_S = (0,1,4) + (20,0,8) + (30,5,0) = (50,6,12).\)

Therefore the total over-all demand is given by the sum 


(7,25,18) + (50,6,12) = (57,31,30). 

Suppose now that the price of coal is $1 per unit, the price of elec¬ 
tricity is $2 per unit, and the price of steel is $4 per unit. Then these 
prices can be represented by the column vector 

\[ p = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}. \]

Consider the steel industry: it sells a total of 30 units of steel at $4
per unit so that its total income is $120. Its bill for the various goods
is given by the vector product

\[ d_S \cdot p = (30,5,0)\begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} = 30 + 10 = \$40. \]


Hence the profit of the steel industry is $120 — $40 = $80. In the 
exercises below the profits of the other industries will be found. 
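The bookkeeping of this example can be sketched in Python. The variable and function names below are our own, the demand vectors are stored as tuples, and only the steel industry's profit is computed, as in the text.

```python
def vector_sum(vectors):
    """Add any number of equal-length vectors component by component."""
    return tuple(sum(components) for components in zip(*vectors))

def dot(row, column):
    """Row-times-column product."""
    return sum(r * c for r, c in zip(row, column))

consumer_demands = [(3, 2, 5), (0, 17, 1), (4, 6, 12)]
industry_demands = {
    "coal": (0, 1, 4),
    "electricity": (20, 0, 8),
    "steel": (30, 5, 0),
}
prices = (1, 2, 4)  # dollars per unit of coal, electricity, steel

consumer_total = vector_sum(consumer_demands)            # (7, 25, 18)
industry_total = vector_sum(industry_demands.values())   # (50, 6, 12)
overall = vector_sum([consumer_total, industry_total])   # (57, 31, 30)

steel_income = overall[2] * prices[2]                    # 30 units at $4
steel_bill = dot(industry_demands["steel"], prices)
steel_profit = steel_income - steel_bill
print(steel_income, steel_bill, steel_profit)            # 120 40 80
```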



This model of an economy is unrealistic in two senses. First we
have not chosen realistic numbers for the various quantities involved.
Second, and more important, we have neglected the fact that the
more an industry produces the more inputs it requires. The latter
complication will be introduced in Chapter VII.

Example 3. Consider the rec¬ 
tangular coordinate system in the 
plane shown in Figure 1. A two-component row vector x = (a,b) can be
regarded as a point in the plane located by means of the coordinate
axes as shown. The point x can be found by starting at the origin
of coordinates O and moving a distance a along the \(x_1\) axis; then
moving a distance b along a line parallel to the \(x_2\) axis. If we have
two such points, say x = (a,b) and y = (c,d), then the points x + y,
-x, -y, x - y, y - x, -x - y have the geometric significance
shown in Figure 2.

Figure 2



Figure 3 Figure 4 
The idea of multiplying a row vector by a number can also be given
a geometric meaning; see Figure 3. There we have plotted the point
corresponding to the vector x = (1,2) and 2x, (1/2)x, -x, and -2x.
Observe that all these points lie on a line through the origin of
coordinates. Another vector quantity which has geometrical
significance is the vector z = ax + (1 - a)y, where a is any number
between 0 and 1. Observe in Figure 4 that the points z all lie on the
line segment between the points x and y. If a = 1/2 the corresponding
point on the line segment is the mid-point of the segment. Thus, if
x = (a,b) and y = (c,d) then the point

\[ \tfrac{1}{2}x + \tfrac{1}{2}y = \tfrac{1}{2}(a,b) + \tfrac{1}{2}(c,d) = \left( \frac{a + c}{2}, \frac{b + d}{2} \right) \]

is the mid-point of the line segment between x and y.
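The points z = ax + (1 - a)y are easy to tabulate. In the sketch below the function name `combine` and the two sample points are our own illustrative choices, not taken from the figures; exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def combine(a, x, y):
    """Return the point a*x + (1 - a)*y, computed component by component."""
    return tuple(a * xi + (1 - a) * yi for xi, yi in zip(x, y))

x = (1, 2)   # hypothetical point
y = (5, 4)   # hypothetical point

midpoint = combine(Fraction(1, 2), x, y)
quarter = combine(Fraction(1, 4), x, y)   # lies nearer to y than to x
print(midpoint == (3, 3), quarter == (4, Fraction(7, 2)))
```

Taking a = 1 returns x itself and a = 0 returns y, which are the two ends of the segment.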

EXERCISES 

1. Compute the quantities below for the following vectors: 

u = (1,-1,4), x = (0,1,2),

’ ■ (?) v ■ h) 

(a) u·v + x·y = ?

(b) (-u + 5x)·(3v - 2y) = ?

(c) 5u·v + 10[x·(2v - y)] = ?

(d) 2[(u - x)·(v + y)] = ?

2. Plot the points corresponding to the row vectors x = (3,4) and y = 
(—2,7). Then compute and plot the following vectors. 

(a) §x + iy. 

(b) x + y. 

(c) x - 2 y. 

(d) Jx + \y. 

(e) 3x - 2 y. 

(f) 4y - 3x.

3. If x = (1,-1,2) and y = (0,1,3) are points in space, what is the mid-point
of the line segment joining x to y? [Ans. (1/2, 0, 5/2).]

[Ans. 12.]
[Ans. 55.]

4. If u is a three-component row vector and v is a three-component column
vector having the same number of components, and a is a number, prove
that a(u·v) = (au)·v = u·(av).

5. Suppose that Brown, Jones, and Smith go to the grocery store and 
purchase the following items: 

Brown: two apples, six lemons, and five pears; 

Jones: two dozen eggs, two lemons, and two dozen oranges; 

Smith: ten apples, one dozen eggs, two dozen oranges, and a half dozen 
pears. 

(a) How many different kinds of items did they purchase? [Ans. 5.] 

(b) Write each of their purchases as row vectors with as many com¬ 
ponents as the answer found in (a). 

(c) Using the price vector given in Example 1, compute each man’s 

grocery bill. [Ans. $0.97; $2.82; $2.74.] 

(d) By means of vector addition find the total amount of their pur¬ 
chases as a row vector. 

(e) Compute in two different ways the total amount spent by the 

three men at the grocery store. [Ans. $6.53.] 

6. Prove that vector multiplication satisfies the following two properties 

(i) u·(av) = a(u·v)

(ii) u·(v + w) = u·v + u·w

where u is a three-component row vector, v and w are three-component 
column vectors, and a is a number. 

7. The production of a book involves several steps: first it must be set in 
type, then it must be printed, and finally it must be supplied with covers and 
bound. Suppose that the typesetter charges $6 an hour, paper costs 1/4 cent
per sheet, that the printer charges 11 cents for each minute that his press 
runs, that the cover costs 28 cents, and that the binder charges 15 cents to 
bind each book. Suppose now that a publisher wishes to print a book that 
requires 300 hours of work by the typesetter, 220 sheets of paper per book, 
and 5 minutes of press time per book. 

(a) Write a five-component row vector which gives the requirements 
for the first book. Write another row vector which gives the re¬ 
quirements for the second, third, . . . copies of the book. Write 
a five-component column vector whose components give the prices 
of the various requirements for each book, in the same order as 
they are listed in the requirement vectors above. 

(b) Using vector multiplication, find the cost of publishing one copy 

of a book. [Ans. $1,801.53.] 

(c) Using vector addition and multiplication, find the cost of printing 

a first edition run of 5000 copies. [Ans. $9,450.] 
(d) Assuming that the type plates from the first edition are used
again, find the cost of printing a second edition of 5000 copies.

[Ans. $7,650.]

8. Perform the following calculations for Example 2. 

(a) Compute the amount that each industry and each consumer has 
to pay for the goods it receives. 

(b) Compute the profit made by each of the industries. 

(c) Find the total amount of money that is paid out by all the in¬ 
dustries and consumers. 

(d) Find the proportion of the total amount of money found in (c) 
paid out by the industries. Find the proportion of the total money 
that is paid out by the consumers. 

9. A building contractor has accepted orders for five ranch style houses, 

seven Cape Cod houses, and twelve Colonial style houses. Write a three- 
component row vector x whose components give the numbers of each type 
of house to be built. Suppose that he knows that a ranch style house requires 
20 units of wood; a Cape Cod, 18 units; and a Colonial style, 25 units of 
wood. Write a column vector u whose components give the various quantities 
of wood needed for each type of house. Find the total amount of wood needed 
by computing the matrix product xu. [Ans. 526.] 

10. Let x = (\(x_1\), \(x_2\)) and let a and b be the vectors

* ■ (O' 1 ■ 0 

If x·a = -1 and x·b = 7, determine \(x_1\) and \(x_2\). [Ans. \(x_1\) = -31; \(x_2\) = 23.]

11. Let x = (\(x_1\), \(x_2\)) and let a and b be the vectors



If x·a = \(x_1\) and x·b = \(x_2\), determine \(x_1\) and \(x_2\).


3. MATRICES AND THEIR COMBINATION WITH VECTORS 

A matrix is a rectangular array of numbers written in the form 


\[ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \]

Here the letters \(a_{ij}\) stand for real numbers and m and n are integers.
Observe that m is the number of rows and n is the number of columns
of the matrix. For this reason we call it an m X n matrix. If m = n
the matrix is square. The following are examples of matrices.


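(1,2,3),

\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 7 & -8 & 9 & 10 \\ 3 & -1 & 14 & 2 & -6 \\ 0 & 3 & -5 & 7 & 0 \end{pmatrix}. \]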


The first example is a row vector which is a 1 X 3 matrix; the second
is a column vector which is a 3 X 1 matrix; the third example is a
2 X 2 square matrix; the fourth is a 4 X 4 square matrix; and the
last is a 3 X 5 matrix.

Two matrices having the same shape (i.e., having the same number 
of rows and columns) are said to be equal if and only if the correspond¬ 
ing entries are equal. 

Recall that in Chapter IV, Section 13, we found that a matrix arose 
naturally in the consideration of a Markov chain process. To give 
another example of how matrices occur in practice and are used in 
connection with vectors we consider the following example. 


Example 1. Suppose that a building contractor has accepted orders
for five ranch style houses, seven Cape Cod houses, and twelve
Colonial style houses. We can represent his orders by means of a row
vector x = (5,7,12). The contractor is familiar, of course, with the
kinds of "raw materials" that go into each type of house. Let us
suppose that these raw materials are steel, wood, glass, paint, and 
labor. The numbers in the matrix below give the amounts of each 
raw material going into each type of house, expressed in convenient 
units. (The numbers are put in arbitrarily, and are not meant to be 
realistic.) 



            Steel  Wood  Glass  Paint  Labor

Ranch:        5     20    16      7     17
Cape Cod:     7     18    12      9     21
Colonial:     6     25     8      5     13


Observe that each row of the matrix is a five-component row vector 
which gives the amounts of each raw material needed for a given kind 
of house. Similarly, each column of the matrix is a three-component
column vector which gives the amounts of a given raw material 
needed for each kind of house. Clearly, a matrix is a very succinct 
way of summarizing this information. 

Suppose now that the contractor wishes to compute how much of 
each raw material to obtain in order to fulfill his contracts. Let us 
denote the matrix above by R; then he would like to obtain something 
like the product xR, and he would like the product to tell him what 
orders to make out. The product should have the following form: 


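\[ xR = (5,7,12)\begin{pmatrix} 5 & 20 & 16 & 7 & 17 \\ 7 & 18 & 12 & 9 & 21 \\ 6 & 25 & 8 & 5 & 13 \end{pmatrix} \]

\[ = (5\cdot 5 + 7\cdot 7 + 12\cdot 6,\; 5\cdot 20 + 7\cdot 18 + 12\cdot 25,\; 5\cdot 16 + 7\cdot 12 + 12\cdot 8,\; 5\cdot 7 + 7\cdot 9 + 12\cdot 5,\; 5\cdot 17 + 7\cdot 21 + 12\cdot 13) \]

\[ = (146,526,260,158,388). \]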


Thus we see that the contractor should order 146 units of steel, 526 
units of wood, 260 units of glass, 158 units of paint, and 388 units 
of labor. Observe that the answer we get is a five-component row 
vector and that each entry in this vector is obtained by taking the 
vector product of x times the corresponding column of the matrix R. 

The contractor is also interested in the prices that he will have to 
pay for these materials. Suppose that steel costs $15 per unit, wood 
costs $8 per unit, glass costs $5 per unit, paint costs $1 per unit, and 
labor costs $10 per unit. Then we can write the cost as a column 
vector as follows: 


\[ y = \begin{pmatrix} 15 \\ 8 \\ 5 \\ 1 \\ 10 \end{pmatrix}. \]

Here the product Ry should give the costs of each type of house, so 
that the multiplication should have the form 
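\[ Ry = \begin{pmatrix} 5 & 20 & 16 & 7 & 17 \\ 7 & 18 & 12 & 9 & 21 \\ 6 & 25 & 8 & 5 & 13 \end{pmatrix} \begin{pmatrix} 15 \\ 8 \\ 5 \\ 1 \\ 10 \end{pmatrix} = \begin{pmatrix} 5\cdot 15 + 20\cdot 8 + 16\cdot 5 + 7\cdot 1 + 17\cdot 10 \\ 7\cdot 15 + 18\cdot 8 + 12\cdot 5 + 9\cdot 1 + 21\cdot 10 \\ 6\cdot 15 + 25\cdot 8 + 8\cdot 5 + 5\cdot 1 + 13\cdot 10 \end{pmatrix} = \begin{pmatrix} 492 \\ 528 \\ 465 \end{pmatrix}. \]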

Thus the cost of materials for the ranch style house is $492, for the 
Cape Cod house is $528, and for the Colonial house $465. 

The final question which the contractor might ask is what is the
total cost of raw materials for all the houses he will build. It is easy
to see that this is given by the product xRy, which is a single number.
We can find it in two ways as shown below.


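\[ xRy = (xR)y = (146,526,260,158,388)\begin{pmatrix} 15 \\ 8 \\ 5 \\ 1 \\ 10 \end{pmatrix} = 11{,}736, \]

\[ xRy = x(Ry) = (5,7,12)\begin{pmatrix} 492 \\ 528 \\ 465 \end{pmatrix} = 11{,}736. \]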


The total cost is then $11,736. 
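The contractor's three computations can be checked with a short Python sketch. The helper names are our own; the matrix R is stored as a list of rows, and row and column vectors are both plain lists.

```python
def dot(row, column):
    """Row-times-column product."""
    return sum(r * c for r, c in zip(row, column))

def row_times_matrix(x, A):
    """xA: multiply the row vector x into each column of A in turn."""
    columns = zip(*A)                      # the columns of A
    return [dot(x, col) for col in columns]

def matrix_times_column(A, u):
    """Au: multiply each row of A into the column vector u."""
    return [dot(row, u) for row in A]

R = [[5, 20, 16, 7, 17],    # ranch
     [7, 18, 12, 9, 21],    # Cape Cod
     [6, 25, 8, 5, 13]]     # Colonial
x = [5, 7, 12]              # houses ordered of each type
y = [15, 8, 5, 1, 10]       # dollars per unit of each raw material

materials = row_times_matrix(x, R)        # [146, 526, 260, 158, 388]
house_costs = matrix_times_column(R, y)   # [492, 528, 465]
total_first_way = dot(materials, y)       # (xR)y
total_second_way = dot(x, house_costs)    # x(Ry)
print(total_first_way, total_second_way)  # 11736 11736
```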

We shall adopt, in general, the above definitions for the multiplica¬ 
tion of a matrix times a row or a column vector. 


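Definition. Let A be an m X n matrix, let x be an m-component
row vector, and let u be an n-component column vector; then we define
the products xA and Au as follows:

\[ xA = (x_1, x_2, \ldots, x_m)\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \]

\[ = (x_1a_{11} + x_2a_{21} + \cdots + x_ma_{m1},\; x_1a_{12} + x_2a_{22} + \cdots + x_ma_{m2},\; \ldots,\; x_1a_{1n} + x_2a_{2n} + \cdots + x_ma_{mn}), \]

\[ Au = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} = \begin{pmatrix} a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\ a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\ \vdots \\ a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n \end{pmatrix}. \]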

The reader will find these formulas easy to work with if he observes
that each entry in the products xA or Au is obtained by vector
multiplication of x or u by a column or row of the matrix A. Notice
that in order to multiply a row vector times a matrix, the number 
of rows of the matrix must equal the number of components of the 
vector, and the result is another row vector; similarly, to multiply 
a matrix times a column vector, the number of columns of the matrix 
must equal the number of components of the vector, and the result 
of such a multiplication is another column vector. 

Some numerical examples of the multiplication of vectors and 
matrices are: 



Observe that if x is an m-component row vector and A is m X n, then
xA is an n-component row vector; similarly, if u is an n-component
column vector, then Au is an m-component column vector. These
facts can be observed in the examples above.
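The shape requirements in the definition can be made explicit in code. The following Python sketch follows the index formulas of the definition directly; the function names and the small 2 X 3 matrix are our own illustrative choices.

```python
def row_times_matrix(x, A):
    """xA for an m-component row vector x and an m X n matrix A."""
    m, n = len(A), len(A[0])
    if len(x) != m:
        raise ValueError("x must have as many components as A has rows")
    return [sum(x[i] * A[i][j] for i in range(m)) for j in range(n)]

def matrix_times_column(A, u):
    """Au for an m X n matrix A and an n-component column vector u."""
    m, n = len(A), len(A[0])
    if len(u) != n:
        raise ValueError("u must have as many components as A has columns")
    return [sum(A[i][j] * u[j] for j in range(n)) for i in range(m)]

A = [[1, 2, 3],
     [4, 5, 6]]          # hypothetical 2 X 3 matrix
x = [1, -1]              # 2-component row vector
u = [1, 0, 2]            # 3-component column vector

xA = row_times_matrix(x, A)       # a 3-component row vector
Au = matrix_times_column(A, u)    # a 2-component column vector
print(xA, Au)                     # [-3, -3, -3] [7, 16]
```

Calling `row_times_matrix(u, A)` raises an error, since a three-component row vector cannot multiply a matrix with two rows.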


Example 2. In Exercise 6 of Chapter IV, Section 13, we considered 
a Markov chain with transition matrix 
The initial state was chosen by a random device that selected states
\(a_1\) and \(a_2\) each with probability 1/2. Let us indicate the choice of initial
state by the vector \(p^{(0)} = (1/2, 1/2)\), where the first component gives the
probability of choosing state \(a_1\) and the second the probability of
choosing state \(a_2\). Let us compute the product \(p^{(0)}P\). We have

P®P-(M)(j |) 

= (i + h i +1) 

— (A,A)- 

Using the methods of Chapter IV, one can show that after one step 
there is probability ^ that the process will be in state ai and probabil¬ 
ity xv that it will be in state a 2 . Let p (1) be the vector whose first com¬ 
ponent gives the probability of the process being in state cti after one 
step and whose second component gives the probability of it being 
state a 2 after one step. In our example we have p (1) = = p (0) P. 

In general the formula \(p^{(1)} = p^{(0)}P\) holds for any Markov process
with transition matrix P and initial probability vector \(p^{(0)}\).
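The formula for one step of a Markov process can be sketched in Python as follows. The transition matrix used here is a hypothetical one chosen only for illustration; it is not the matrix of the exercise cited above. Exact fractions are used so the probabilities can be compared without rounding.

```python
from fractions import Fraction as F

def step(p, P):
    """One step of the process: the row vector p times the matrix P."""
    n = len(P[0])
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(n)]

# Hypothetical two-state transition matrix; each row sums to 1.
P = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]
p0 = [F(1, 2), F(1, 2)]   # each state chosen with probability 1/2

p1 = step(p0, P)
print(p1)                 # [Fraction(3, 8), Fraction(5, 8)]
```

The components of the result again add up to 1, as they must for a probability vector.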


EXERCISES 


1. Perform the following multiplications: 

, 1)G) - * 


(a) 


(b) (3 ; 


(4 

■-*>(4 “i)-» 


/ 1 3 

7 -1 


(c) 


°\ 

3 



H) 


= ? 




= ? 


( 1 7-8 9 10\ 

3-1 14 2 —6 ) = ? 

0 3 -5 7 0/ 


[Ans. (11,-11).]


[Ans. (0,0).] 
<*> 5) - 

(h > C 2X«)' ! 


[Ans. \((ax_1 + cx_2,\; bx_1 + dx_2)\).]



(j) \((x_1, x_2, x_3)\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = ?\)

2. What number does the matrix in parts (i) and (j) above resemble?

3. Notice that in Exercise 1(d) above the product of a row vector, none 
of whose components is zero, times a matrix, none of whose components is 
zero, yields the zero row vector. Find another example which is similar to 
this one. Answer the analogous question for Exercise 1(e). 

4. When possible, solve for the indicated quantities. 


(a) (x h x 2 )[ 


; case can 


)^7 3 ) = vec ^ or Xt [Ans. (3,1).] 

(b) (2, —l)^ a ^ = (6,3). Find the matrix ^ ^ In this < 
you find more than one solution? 

(c) 


(d) 


( 1 = ( 4 ) Find ^ vector u% 

( _ i J)(:)'(-6> r ““- 


How many solutions can you find? 


( 4& — 3\ 

^ j, for any number k.] 


5. Solve for the indicated quantities below and give an interpretation for 
each. 


(a) 


[Ans. a = 2.]


(!, — !)( _2 4 ) = find «• 

(b) ^ = <*> *' in< ^ u ' ® ow many answers can y° u f' ind ’ 

(i) 


[Ans. u 


for any number k.] 


(c) 


l' 5 ¥ II 1 1 = | j; find u. How many answers are there? 

\t £/\W \U2/ 
6. In Exercise 5 of the preceding section construct the 3 X 5 matrix
whose rows give the various purchases of Brown, Jones, and Smith. Multiply 
on the right by the five-component price (column) vector to find the three- 
component column vector whose entries give each person’s grocery bill. 
Multiply on the left by the row vector x = (1,1,1) and on the right by the 
price vector to find the total amount that they spent in the store. 

7. In Example 1 of this section, assume that the contractor is to build 
seven ranch style, three Cape Cod, and five Colonial type houses. Recom¬ 
pute, using matrix multiplication, the total cost of raw materials, in two 
different ways as in the example. 

8. In Example 2 of this section, assume that the initial probability vector 

is p (0) = (i,|). Find the vector p a \ [Ans. (£, f).] 

9. For the Markov process whose transition matrix is 



assume the initial probability vector is p (0) = (i,l,J). Draw the tree of the 
process and find the tree measures. Compute \(p^{(1)}\) by means of the tree
measure and also from the formula \(p^{(1)} = p^{(0)}P\) and show that the two
answers agree.

10. Consider the Markov process with two states whose transition matrix
is

\[ P = \begin{pmatrix} a & 1 - a \\ 1 - b & b \end{pmatrix}, \]

where a and b are nonnegative numbers. Suppose the initial probability
vector for the process is

\[ p^{(0)} = (p_1^{(0)}, p_2^{(0)}) \]

where \(p_1^{(0)}\) is the initial probability of choosing state 1 and \(p_2^{(0)}\) is the initial
probability of choosing state 2. Derive the formulas for the components of
the vector \(p^{(1)}\). [Ans. \(p^{(1)} = (ap_1^{(0)} + (1 - b)p_2^{(0)},\; (1 - a)p_1^{(0)} + bp_2^{(0)})\).]

11. In Example 2 use tree measures to show that \(p^{(2)} = p^{(1)}P\).

12. The following matrix gives the vitamin contents of three food items, 
in conveniently chosen units: 

Vitamin:     A    B    C    D

Food I:     .5   .5    0    0
Food II:    .3    0   .2   .1
Food III:   .1   .1   .2   .5

If we eat 5 units of food I, 10 units of food II, and 8 units of food III, how
much of each type of vitamin have we consumed? If we pay only for the
vitamin content of each food, paying 10 cents, 20 cents, 25 cents, and 50
cents, respectively, for units of the four vitamins, how much does a unit of
each type of food cost? Compute in two ways the total cost of the food we ate.

[Ans. (6.3,3.3,3.6,5.0); \(\begin{pmatrix} 15 \\ 13 \\ 33 \end{pmatrix}\) cents; $4.69.]


4. THE ADDITION AND MULTIPLICATION OF MATRICES 


Two matrices of the same shape, that is, having the same number 
of rows and columns, can be added together by adding corresponding 
components. For example, if A and B are two 2X3 matrices, we 
have 


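\[ A + B = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{pmatrix}. \]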

Observe that the addition of vectors (row or column) is simply a 
special case of the addition of matrices. Numerical examples of the 
addition of matrices are the following: 


(1,0,-2) + (0,5,0) = (1,5,-2); 

G !M _ i -?) - (2 o> 



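\[ \begin{pmatrix} 7 & 0 & 0 \\ -3 & 1 & -6 \\ 4 & 0 & 7 \\ 0 & -2 & -2 \\ 1 & 1 & 1 \end{pmatrix} + \begin{pmatrix} -8 & 0 & 1 \\ 4 & 5 & -1 \\ 0 & 3 & 0 \\ -1 & 1 & -1 \\ 0 & -4 & 2 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 1 \\ 1 & 6 & -7 \\ 4 & 3 & 7 \\ -1 & -1 & -3 \\ 1 & -3 & 3 \end{pmatrix}. \]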


Other examples occur in the exercises. The reader should observe 
that we do not add matrices of different shapes. 

If A is a matrix and k is any number, we define the matrix kA as 


\[ kA = k\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} = \begin{pmatrix} ka_{11} & ka_{12} & \cdots & ka_{1n} \\ ka_{21} & ka_{22} & \cdots & ka_{2n} \\ \vdots & \vdots & & \vdots \\ ka_{m1} & ka_{m2} & \cdots & ka_{mn} \end{pmatrix}. \]

Observe that this is merely component-wise multiplication as was the
analogous concept for vectors. Some examples of multiplication of
matrices by constants are 



The multiplication of a vector by a number is, of course, a special 
case of the multiplication of a matrix by a number. 
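Addition of matrices and multiplication by a number can both be sketched in Python with matrices stored as lists of rows. The function names and the 2 X 2 matrix M are our own; the row-vector sum is the first numerical example given above, written as 1 X 3 matrices.

```python
def mat_add(A, B):
    """Add two matrices of the same shape, entry by entry."""
    same_shape = len(A) == len(B) and all(
        len(ra) == len(rb) for ra, rb in zip(A, B))
    if not same_shape:
        raise ValueError("matrices of different shapes are not added")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(k, A):
    """Multiply every entry of A by the number k."""
    return [[k * a for a in row] for row in A]

# Row vectors treated as 1 X 3 matrices, as in the text.
row_sum = mat_add([[1, 0, -2]], [[0, 5, 0]])

M = [[1, -2],
     [0, 3]]                  # hypothetical 2 X 2 matrix
tripled = mat_scale(3, M)
print(row_sum, tripled)       # [[1, 5, -2]] [[3, -6], [0, 9]]
```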

Under certain conditions two matrices can be multiplied together 
to give a new matrix. As an example, let A be a 2 X 3 matrix and 
B be a 3 X 2 matrix. Then the product AB is found as 


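\[ AB = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} & a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32} \\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} & a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32} \end{pmatrix}. \]

Observe that the product is a 2 X 2 matrix. Also notice that each
entry in the new matrix is the product of one of the rows of A times
one of the columns of B; for example, the entry in the second row
and first column is found as the product

\[ (a_{21}\; a_{22}\; a_{23})\begin{pmatrix} b_{11} \\ b_{21} \\ b_{31} \end{pmatrix} = a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31}. \]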


The following definition holds for the general case of matrix
multiplication:


Definition. Let A be an m X k matrix and B be a k X n matrix;
then the product matrix C = AB is an m X n matrix whose
components are

\[ c_{ij} = (a_{i1}\; a_{i2}\; \ldots\; a_{ik})\begin{pmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{kj} \end{pmatrix} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ik}b_{kj}. \]

The important things to remember about this definition are: first,
in order to be able to multiply matrix A times matrix B, the number
of columns of A must be equal to the number of rows of B; second,
the product matrix C = AB has the same number of rows as A and
the same number of columns as B; finally, to get the entry in the ith
row and jth column of AB we multiply the ith row of A times the
jth column of B. Notice that the product of a vector times a matrix
is a special case of matrix multiplication.

Below are several examples of matrix multiplication: 

C "iX-5 -!) - (“ -!> 

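\[ \begin{pmatrix} 3 & 0 & 1 \\ -1 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 1 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 1 & 1 \\ -1 & -2 & 0 \\ 2 & 2 & 2 \end{pmatrix}; \]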



10 4 4\ 
6 5 5/ 
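The definition of the matrix product translates almost word for word into Python. In this sketch the function name `mat_mul` is our own and matrices are lists of rows; the 3 X 3 product is one of the examples given above.

```python
def mat_mul(A, B):
    """Product of an m X k matrix A and a k X n matrix B."""
    k = len(B)
    if any(len(row) != k for row in A):
        raise ValueError("columns of A must equal rows of B")
    n = len(B[0])
    # Entry (i, j) is the ith row of A times the jth column of B.
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(n)]
            for i in range(len(A))]

A = [[3, 0, 1],
     [-1, 2, 0],
     [0, 0, 2]]
B = [[1, 0, 0],
     [0, -1, 0],
     [1, 1, 1]]

C = mat_mul(A, B)
print(C)   # [[4, 1, 1], [-1, -2, 0], [2, 2, 2]]
```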


One obvious question that now arises is that of multiplying more 
than two matrices together. Let A be an m X h matrix, let B be an 
h X k matrix, and let C be a k X n matrix. Then we can certainly 
define the products (AB)C and A(BC). It turns out that these two 
products are equal, and we define the product ABC to be their
common value, i.e.,

ABC = A(BC) = (AB)C.


The rule expressed in the above equation is called the associative law
for multiplication. We shall not prove the associative law here
although the student will be asked to check an example of it in
Exercise 5.

If A and B are square matrices of the same size, then they can be 
multiplied in either order. It is not true, however, that the product 
AB is necessarily equal to the product BA . For example, if 

A -C i) and ®-G s) 

then we have 

- - G iXl o) - C o) 
whereas


- - (I X S) - (1 !> 


and it is clear that AB ≠ BA.
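Both facts, that the order of the factors matters while the grouping does not, can be observed numerically. The matrices below are hypothetical 2 X 2 matrices of our own choosing, not those of the example above, and `mat_mul` is our own helper.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]
C = [[0, 1],
     [1, 0]]

AB = mat_mul(A, B)    # [[2, 1], [1, 1]]
BA = mat_mul(B, A)    # [[1, 1], [1, 2]]
print(AB != BA)       # True: the order of the factors matters

left = mat_mul(mat_mul(A, B), C)    # (AB)C
right = mat_mul(A, mat_mul(B, C))   # A(BC)
print(left == right)  # True: the grouping does not matter
```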


EXERCISES 

1. Perform the following operations. 

/ e A / 4 2 \ 

(a) 2{ 0 -3 - 3( 0 1 = ? 

\-l 2/ \-5 -1/ 

«G -i -in;: :0= j 
<°>(» -iXi ? 

(d, G J "i)(j j)-» 

«(-i 

/ 4 1 4\/ 3 0 l\ 

(f) -1 -2 -1 )( -1 2 0 ) = ? [Ans. 

V 2 -1 —2/V 0 0 2/ 


-1 J 


[Ans. ^ 




o 

0 

f—7 

9 

-5 

6 0\ 

7 

5 

Ul 

0 

3 

—4 1/ 

^ < 

1 

00 ( 






\ 0 - 2 / 

2. Let A be any 3 X 3 matrix and let I be the matrix

\[ I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

Show that AI = IA = A. The matrix I acts for the products of matrices in
the same way that the number 1 acts for products of numbers. For this reason
it is called the identity matrix.





3. Let A be any 3 X 3 matrix and let 0 be the matrix

\[ 0 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]

Show that A0 = 0A = 0 for any A. Also show that A + 0 = 0 + A = A
for any A. The matrix 0 acts for matrices in the same way that the number 0
acts for numbers. For this reason it is called the zero matrix.

4 - IfA = (S ?) and B = C o) showthat AB = (o o)' 

Thus the product of two matrices can be the zero matrix even though neither 
of the matrices is itself zero. Find another example that illustrates this point. 


5. Verify the associative law for the special case when 

*-(-} J * 

6 . Consider the matrices 

A “ ( 1 0 5 

\-l 17 57/ 


D 


The shapes of these are 2 X 3, 4 X 3, 3 X 3, and 3 X 2, respectively. What
is the shape of 

(a) AC. 

(b) DA. 

(c) AD. 

(d) BC. 

(e) CB. 

(f) DAC. 

(g) BCDA . [Ans. 4 X 3.] 



7. In Exercise 6 find: 

(a) The component in the second row and second column of AC. 

[Ans. 40.] 

(b) The component in the fourth row and first column of BC. 

(c) The component in the last row and last column of DA. [Ans. 58.] 

(d) The component in the first row and first column of CB. 
8. If A is a square matrix, it can be multiplied by itself; hence we can
define (using the associative law)

\[ A^2 = A\cdot A \]

\[ A^3 = A^2\cdot A = A\cdot A\cdot A \]

\[ A^n = A^{n-1}\cdot A = A\cdot A\cdots A \quad (n \text{ factors}). \]


These are naturally called “powers” of a matrix—the first one being called 
the square; the second, the cube; etc. Compute the indicated powers of 
the following matrices. 


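(a) If

\[ A = \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix}, \]

find \(A^2\), \(A^3\), and \(A^4\).

[Ans. \( \begin{pmatrix} 1 & 0 \\ 15 & 16 \end{pmatrix} \); \( \begin{pmatrix} 1 & 0 \\ 63 & 64 \end{pmatrix} \); \( \begin{pmatrix} 1 & 0 \\ 255 & 256 \end{pmatrix} \).]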


(b) If I and 0 are the matrices defined in Exercises 2 and 3, find
\(I^2\), \(I^3\), \(I^n\), \(0^2\), \(0^3\), and \(0^n\).

(c) If \( A = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 2 & -1 & 0 \end{pmatrix} \), find \(A^2\), \(A^3\), and \(A^n\).


(d) If A = Q 


D 


find A n . 


9. Cube the matrix 



Compare your answer with the matrix \(P^{(3)}\) in Example 1, Chapter IV,
Section 13, and comment on the result.


10. Consider a two-state Markov process whose transition matrix is



(a) Assuming that the process starts in state 1, draw the tree and set 
up tree measures for three stages of the process. Do the same, 
assuming that the process starts in state 2. 

(b) Using the trees drawn in (a), compute the quantities \(p_{11}^{(3)}\),
\(p_{12}^{(3)}\), \(p_{21}^{(3)}\), \(p_{22}^{(3)}\). Write the matrix \(P^{(3)}\).

(c) Compute the cube \(P^3\) of the matrix P.

(d) Compare the answers you found in parts (b) and (c) and show
that \(P^{(3)} = P^3\).

11. Show that the fifth and all higher powers of the matrix

/o 1 o\ 



have all entries positive. Show that no smaller power has this property. 


12. In Example 1 of Section 3 assume that the contractor wishes to take
into account the cost of transporting raw materials to the building site as
well as the purchasing cost. Suppose the costs are as given in the matrix
below:

             Purchase   Transport
           (    15         4.5    )   Steel
           (     8         2      )   Wood
    Q  =   (     5         3      )   Glass
           (     1         0.5    )   Paint
           (    10         0      )   Labor

Referring to the example:

(a) By computing the product RQ find a 3 × 2 matrix whose entries
give the purchase and transportation costs of the materials for
each kind of house.

(b) Find the product xRQ, which is a two-component row vector
whose first component gives the total purchase price and second
component gives the total transportation cost.

(c) Let \( z = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \) and then compute xRQz, which is a number giving
the total cost of materials and transportation for all the houses
being built. [Ans. $14,304.]


13. A college survey at an all-male school shows that dates of students 
are distributed as follows: a freshman dates one blonde and one brunette 
during the year; each sophomore dates one blonde, three brunettes, and one 
redhead; each junior dates three blondes, two brunettes, and two redheads; 
each senior dates three redheads. It is further known that each blonde brings 
three dresses with her, two skirts, two blouses, and one sweater; each brunette 
brings five dresses, four skirts, one blouse, and three sweaters; each redhead 
brings one dress, four skirts, and four sweaters. If each dress costs $50, 
each skirt $15, each blouse $10, and each sweater $5; and if there are 500 
freshmen, 400 sophomores, 300 juniors, and 200 seniors, 

(a) What is the total number of blondes, brunettes, and redheads 
dated? 

(b) What is the total number of each type of clothing item in the
dates' wardrobes?

(c) What is the cost of the wardrobe of a blonde? a brunette? a
redhead?

(d) What is the total cost of all the wardrobes of all the dates?
Calculate two ways. [Ans. $1,347,500.]
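
The bookkeeping in Exercise 13 is a chain of vector-matrix products, and the two ways of computing the total correspond to grouping the product differently. The following is a minimal sketch in Python; the variable names and the helper functions `vec_mat` and `dot` are our own choices for illustration and are not part of the text.

```python
# Rows: freshman, sophomore, junior, senior.
# Columns: blondes, brunettes, redheads dated per student.
dates = [[1, 1, 0], [1, 3, 1], [3, 2, 2], [0, 0, 3]]
# Rows: blonde, brunette, redhead.
# Columns: dresses, skirts, blouses, sweaters brought.
wardrobe = [[3, 2, 2, 1], [5, 4, 1, 3], [1, 4, 0, 4]]
prices = [50, 15, 10, 5]          # dollars per dress, skirt, blouse, sweater
students = [500, 400, 300, 200]   # freshmen, sophomores, juniors, seniors


def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def dot(u, v):
    """Product of a row vector and a column vector."""
    return sum(a * b for a, b in zip(u, v))


dates_total = vec_mat(students, dates)         # part (a)
items_total = vec_mat(dates_total, wardrobe)   # part (b)
cost_per_date = [dot(row, prices) for row in wardrobe]  # part (c)

# Part (d), two ways: (students * dates * wardrobe) * prices
# and (students * dates) * (wardrobe * prices).
total_by_items = dot(items_total, prices)
total_by_dates = dot(dates_total, cost_per_date)
print(dates_total, items_total, cost_per_date, total_by_items, total_by_dates)
```

That the two totals agree is an instance of the associative law for matrix products.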


5. THE SOLUTION OF LINEAR EQUATIONS 

There are many occasions when the simultaneous solution of linear
equations is important. In this section we shall develop methods for
finding out whether a set of linear equations has solutions, and for
finding all such solutions.


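Example 1. Consider the following example of three linear equations
in three unknowns:

(1)   \( x_1 + 4x_2 + 3x_3 = 1 \)

(2)   \( 2x_1 + 5x_2 + 4x_3 = 4 \)

(3)   \( x_1 - 3x_2 - 2x_3 = 5. \)

Before we discuss the solution of these equations we note that they
can be written as a single equation in matrix form as follows:

\[ \begin{pmatrix} 1 & 4 & 3 \\ 2 & 5 & 4 \\ 1 & -3 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 4 \\ 5 \end{pmatrix}. \]
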

One of the uses of vector and matrix notation is in writing a large 
number of linear equations in a single simple matrix equation such 
as the one above. 

The method of solving the linear equations above is the following.
First we use equation (1) to eliminate the variable \(x_1\) from equations
(2) and (3); i.e., we subtract 2 times (1) from (2) and then subtract
(1) from (3), giving

(1')   \( x_1 + 4x_2 + 3x_3 = 1 \)

(2')   \( -3x_2 - 2x_3 = 2 \)

(3')   \( -7x_2 - 5x_3 = 4. \)

Next we divide equation (2') through by the coefficient of \(x_2\), namely,
-3, obtaining \( x_2 + \frac{2}{3}x_3 = -\frac{2}{3} \). We use this equation to eliminate \(x_2\)
from each of the other two equations. In order to do this we subtract
4 times this equation from (1') and add 7 times this equation to (3'),
obtaining

(1'')   \( x_1 + 0 + \frac{1}{3}x_3 = \frac{11}{3} \)

(2'')   \( x_2 + \frac{2}{3}x_3 = -\frac{2}{3} \)

(3'')   \( -\frac{1}{3}x_3 = -\frac{2}{3}. \)

The last step is to divide through (3'') by \(-\frac{1}{3}\), which is the coefficient
of \(x_3\), obtaining the equation \(x_3 = 2\); we use this equation to eliminate
\(x_3\) from the first two equations as follows:

(1''')   \( x_1 + 0 + 0 = 3 \)

(2''')   \( x_2 + 0 = -2 \)

(3''')   \( x_3 = 2. \)


The solution can now be read from these equations as \(x_1 = 3\),
\(x_2 = -2\), and \(x_3 = 2\). The reader should substitute these values into
the original equations (1), (2), and (3) above to see that the solution
has actually been obtained.
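
The elimination steps of Example 1 can be carried out mechanically. The following is a minimal sketch in Python, using exact fractions so that no rounding occurs. The function name `gauss_jordan` is our own, and the sketch assumes that the system has as many equations as unknowns and exactly one solution; the other cases are taken up in the later examples.

```python
from fractions import Fraction


def gauss_jordan(aug):
    """Reduce an augmented matrix [A | b] and return the unique solution.

    Assumes a square system with exactly one solution.
    """
    rows = [[Fraction(v) for v in row] for row in aug]
    n = len(rows)
    for col in range(n):
        # Find an equation, at or below the current one, that contains this variable.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Divide the equation through by the coefficient of the variable.
        lead = rows[col][col]
        rows[col] = [v / lead for v in rows[col]]
        # Use it to eliminate the variable from every other equation.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Equations (1), (2), (3) of Example 1, one row per equation.
example_1 = [[1, 4, 3, 1],
             [2, 5, 4, 4],
             [1, -3, -2, 5]]
solution = gauss_jordan(example_1)
print(solution)
```

Each pass of the loop corresponds to one stage of the hand computation: divide by the leading coefficient, then subtract multiples from the other equations.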

In the example just discussed we saw that there was only one solution
to the set of three simultaneous equations in three variables.
Example 2 will be one in which there is more than one solution, and
Example 3 will be one in which there are no solutions to a set of three
simultaneous equations in three variables.

Example 2. Consider the following linear equations.

(4)   \( x_1 - 2x_2 - 3x_3 = 2 \)

(5)   \( x_1 - 4x_2 - 13x_3 = 14 \)

(6)   \( -3x_1 + 5x_2 + 4x_3 = 0. \)

Let us proceed as before and use equation (4) to eliminate the variable
\(x_1\) from the other two equations. We have

(4')   \( x_1 - 2x_2 - 3x_3 = 2 \)

(5')   \( -2x_2 - 10x_3 = 12 \)

(6')   \( -x_2 - 5x_3 = 6. \)

Proceeding as before, we divide equation (5') by -2, obtaining the
equation \( x_2 + 5x_3 = -6 \). We use this equation to eliminate the variable
\(x_2\) from each of the other equations; namely, we add twice this
equation to (4') and then add the equation to (6').

(4'')   \( x_1 + 0 + 7x_3 = -10 \)

(5'')   \( x_2 + 5x_3 = -6 \)

(6'')   \( 0 = 0. \)


Observe that we have eliminated the last equation completely! We
also see that the variable \(x_3\) can be chosen completely arbitrarily in
these equations. To emphasize this, we move the terms involving \(x_3\)
to the right-hand side, giving

(4''')   \( x_1 = -10 - 7x_3 \)

(5''')   \( x_2 = -6 - 5x_3. \)

The reader should check, by substituting these values of \(x_1\) and \(x_2\)
into equations (4), (5), and (6), that they are solutions regardless of
the value of \(x_3\). Let us also substitute particular values for \(x_3\) to obtain
numerical solutions. Thus, if we let \(x_3 = 1, 0, -2\), respectively, and
compute the resulting numbers, using (4''') and (5'''), we obtain the
following numerical solutions:

\( x_1 = -17, \quad x_2 = -11, \quad x_3 = 1 \)

\( x_1 = -10, \quad x_2 = -6, \quad x_3 = 0 \)

\( x_1 = 4, \quad x_2 = 4, \quad x_3 = -2. \)

The reader should also substitute these numbers into (4), (5), and (6)
to show that they are solutions. To summarize, our second example
has an infinite number of solutions, one for each numerical value of
\(x_3\) which is substituted into equations (4''') and (5''').
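
The substitution check suggested above is easy to automate. The sketch below (in Python; the function names are ours, chosen for illustration) builds a solution from a chosen value of \(x_3\) using (4''') and (5'''), and tests it against the original equations (4), (5), and (6).

```python
def solution_for(x3):
    """Solution of Example 2 determined by a chosen value of x3."""
    x1 = -10 - 7 * x3   # equation (4''')
    x2 = -6 - 5 * x3    # equation (5''')
    return (x1, x2, x3)


def satisfies_example_2(x1, x2, x3):
    """Check the original equations (4), (5), and (6)."""
    return (x1 - 2 * x2 - 3 * x3 == 2
            and x1 - 4 * x2 - 13 * x3 == 14
            and -3 * x1 + 5 * x2 + 4 * x3 == 0)


for x3 in (1, 0, -2, 100):
    triple = solution_for(x3)
    print(triple, satisfies_example_2(*triple))
```

The value 100 is included only to illustrate that any choice of \(x_3\) works, not just the three used in the text.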

Example 3. Suppose that we modify equation (6) by changing
the number on the right-hand side to 2. Then we have

(7)   \( x_1 - 2x_2 - 3x_3 = 2 \)

(8)   \( x_1 - 4x_2 - 13x_3 = 14 \)

(9)   \( -3x_1 + 5x_2 + 4x_3 = 2. \)

If we carry out the same procedure as before and use (7) to eliminate
\(x_1\) from (8) and (9), we obtain

(7')   \( x_1 - 2x_2 - 3x_3 = 2 \)

(8')   \( -2x_2 - 10x_3 = 12 \)

(9')   \( -x_2 - 5x_3 = 8. \)

We divide (8') by -2, the coefficient of \(x_2\), obtaining, as before,
\( x_2 + 5x_3 = -6 \). Using this equation to eliminate \(x_2\) from the other
two equations, we have

(7'')   \( x_1 + 0 + 7x_3 = -10 \)

(8'')   \( x_2 + 5x_3 = -6 \)

(9'')   \( 0 = 2. \)

Observe that the last equation is false. Because our elimination
procedure has led to a false result we conclude that the equations (7),
(8), and (9) have no solution. The student should always keep in
mind that this possibility exists when considering simultaneous
equations.

In the examples above the equations we considered had the same
number of variables as equations. The next example has more variables
than equations and the last has more equations than variables.

Example 4. Consider the following two equations in three variables:

(10)   \( -4x_1 + 3x_2 + 2x_3 = -2 \)

(11)   \( 5x_1 - 4x_2 + x_3 = 3. \)

Using the elimination method outlined above, we divide (10) by -4,
and then subtract 5 times the result from (11), obtaining

(10')   \( x_1 - \frac{3}{4}x_2 - \frac{1}{2}x_3 = \frac{1}{2} \)

(11')   \( -\frac{1}{4}x_2 + \frac{7}{2}x_3 = \frac{1}{2}. \)

Multiplying (11') by -4 and using it to eliminate \(x_2\) from (10'), we
have

(10'')   \( x_1 + 0 - 11x_3 = -1 \)

(11'')   \( x_2 - 14x_3 = -2. \)

We can now let \(x_3\) take on any value whatsoever and solve these
equations for \(x_1\) and \(x_2\). We emphasize this fact by rewriting them
as in Example 2 as

(10''')   \( x_1 = 11x_3 - 1 \)

(11''')   \( x_2 = 14x_3 - 2. \)

The reader should check that these are solutions and also, by choosing
specific values for \(x_3\), find numerical solutions to these equations.

Example 5. Let us consider the other possibility suggested by
Example 4, namely, the case in which we have more equations than
variables. Consider the following equations:

(12)   \( -4x_1 + 3x_2 = 2 \)

(13)   \( 5x_1 - 4x_2 = 0 \)

(14)   \( 2x_1 - x_2 = a, \)

where a is an arbitrary number. Using equation (12) to eliminate \(x_1\)
from the other two we obtain

(12')   \( x_1 - \frac{3}{4}x_2 = -\frac{1}{2} \)

(13')   \( -\frac{1}{4}x_2 = \frac{5}{2} \)

(14')   \( \frac{1}{2}x_2 = a + 1. \)

Next we use (13') to eliminate \(x_2\) from the other equations, obtaining

(12'')   \( x_1 + 0 = -8 \)

(13'')   \( x_2 = -10 \)

(14'')   \( 0 = a + 6. \)

These equations remind us of the situation in Example 3, since we
will be led to a false result unless a = -6. We see that equations
(12), (13), and (14) have the solution \(x_1 = -8\) and \(x_2 = -10\) only if
a = -6. If \(a \ne -6\), then there is no solution to these equations.


The examples above illustrate all the possibilities that can occur 
in the general case. There may be no solutions, exactly one solution, 
or an infinite number of solutions to a set of simultaneous equations. 
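
The three possibilities can be told apart mechanically from the reduced equations: a row reading 0 = c with c nonzero means no solution, and otherwise the solution is unique exactly when every variable has been used to eliminate. The following is a minimal sketch in Python under that reading; the names `reduce_system` and `classify` are our own and are not taken from the text.

```python
from fractions import Fraction


def reduce_system(aug):
    """Apply the elimination procedure to an augmented matrix [A | b].

    Returns the reduced rows and the list of columns (variables)
    that were used to eliminate.
    """
    rows = [[Fraction(v) for v in row] for row in aug]
    n_vars = len(rows[0]) - 1
    pivot_cols = []
    r = 0  # index of the next equation to be used for elimination
    for col in range(n_vars):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no remaining equation contains this variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(len(rows)):
            if i != r:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(col)
        r += 1
    return rows, pivot_cols


def classify(aug):
    """Report which of the three possibilities occurs."""
    rows, pivot_cols = reduce_system(aug)
    n_vars = len(aug[0]) - 1
    # A row reading 0 = c with c nonzero is a false equation.
    if any(all(v == 0 for v in row[:-1]) and row[-1] != 0 for row in rows):
        return "no solution"
    if len(pivot_cols) == n_vars:
        return "exactly one solution"
    return "infinitely many solutions"


print(classify([[1, 4, 3, 1], [2, 5, 4, 4], [1, -3, -2, 5]]))        # Example 1
print(classify([[1, -2, -3, 2], [1, -4, -13, 14], [-3, 5, 4, 0]]))   # Example 2
print(classify([[1, -2, -3, 2], [1, -4, -13, 14], [-3, 5, 4, 2]]))   # Example 3
```

The same function applies unchanged when the number of equations differs from the number of variables, as in Examples 4 and 5.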

The procedure that we have illustrated above is one that turns any
set of linear equations into an equivalent set of equations from which
the existence of solutions and the solutions can be easily read. A student
who learned other ways of solving linear equations may wonder
why we use the above procedure, one which is not always the quickest
way of solving equations. The answer is that we use it because it
always works, that is, it is a canonical procedure to apply to any set
of linear equations. The faster methods usually work only for equations
that have solutions, and even then they may not find all solutions.
The value of a standard infallible method, especially for machine
computation, should not be underestimated.

EXERCISES

1. Find all the solutions of the following simultaneous equations.

(a) \( 4x_1 + 5x_3 = 6. \)
    \( x_2 - 6x_3 = -2. \)
    \( 3x_1 + 4x_3 = 3. \)   [Ans. \(x_1 = 9\), \(x_2 = -38\), \(x_3 = -6\).]

(b) \( 3x_1 - x_2 - 2x_3 = 2. \)
    \( 2x_2 - x_3 = -1. \)
    \( 3x_1 - 5x_2 = 3. \)   [Ans. No solution.]

(c) \( -x_1 + 2x_2 + 3x_3 = 0. \)
    \( x_1 - 4x_2 - 13x_3 = 0. \)
    \( -3x_1 + 5x_2 + 4x_3 = 0. \)   [Ans. \(x_1 = -7x_3\), \(x_2 = -5x_3\).]

2. Find all the solutions of the following simultaneous equations.

(a) \( x_1 + x_2 + x_3 = 0. \)
    \( 2x_1 + 4x_2 + 3x_3 = 0. \)
    \( 4x_2 + 4x_3 = 0. \)

(b) \( x_1 + x_2 + x_3 = -2. \)
    \( 2x_1 + 4x_2 + 3x_3 = 3. \)
    \( 4x_2 + 2x_3 = 2. \)

(c) \( 4x_1 + 4x_3 = 8. \)
    \( x_2 - 6x_3 = -3. \)
    \( 3x_1 + x_2 - 3x_3 = 3. \)

3. Find numbers \(x_1\), \(x_2\), and \(x_3\) that solve the equations given in Exercise
1(c) and that also satisfy the nonlinear equation

\( x_1[2x_2 - 5x_3] = 1. \)

4. Find all solutions of the following equations:

(a) \( 5x_1 - 3x_2 = -7. \)
    \( -2x_1 + 9x_2 = 4. \)
    \( 2x_1 + 4x_2 = -2. \)   [Ans. \(x_1 = -\frac{17}{13}\); \(x_2 = \frac{2}{13}\).]

(b) \( x_1 + 2x_2 = 1. \)
    \( -3x_1 + 2x_2 = -2. \)
    \( 2x_1 + 3x_2 = 1. \)   [Ans. No solution.]

(c) \( 5x_1 - 3x_2 - 7x_3 + x_4 = 10. \)
    \( -x_1 + 2x_2 + 6x_3 - 3x_4 = -3. \)
    \( x_1 + x_2 + 4x_3 - 5x_4 = 0. \)



5. Show that the equations

\( -4x_1 + 3x_2 + ax_3 = c \)
\( 5x_1 - 4x_2 + bx_3 = d \)

always have a solution for all values of a, b, c, and d.

6. Find conditions on a, b, and c in order that the equations

\( -4x_1 + 3x_2 = a \)
\( 5x_1 - 4x_2 = b \)
\( -3x_1 + 2x_2 = c \)

have a solution.   [Ans. 2a + b = c.]

7. (a) Let \(x = (x_1, x_2)\) and let A be the matrix

A = [matrix illegible in source]

Find all solutions of the equation xA = x.   [Ans. x = (0, 0).]

(b) Let \(x = (x_1, x_2)\) and let A be the matrix

A = [matrix illegible in source]

Find all solutions of the equation xA = x.

[Ans. x = (k, k) for any number k.]

8. Let \(x = (x_1, x_2)\) and let P be the matrix

P = [matrix illegible in source]

(a) Find all solutions of the equation xP = x.

(b) Choose the solution for which \(x_1 + x_2 = 1\).

9. If * = (*i,* 2 ,* 3 ) and A is the matrix 

[l -2 0 \ 

\0 -6 -4/ 

find all solutions of the equation xA = *. 

[Ans. x = (—A :/2,ok/4:jk) for any number k.] 

10. If \(x = (x_1, x_2, x_3)\) and P is the matrix

P = [matrix missing in source]

find all solutions of the equation xP = x. Select the unique solution for
which \(x_1 + x_2 + x_3 = 1\).

11. Find all solutions of:

\( x_1 + 2x_2 + 3x_3 + 4x_4 = 10. \)

\( 2x_1 - x_2 + x_3 - x_4 = 1. \)

\( 3x_1 + x_2 + 4x_3 + 3x_4 = 11. \)

\( -2x_1 + 6x_2 + 4x_3 + 10x_4 = 18. \)

[Ans. \( x_1 = \frac{12}{5} - x_3 - \frac{2}{5}x_4 \); \( x_2 = \frac{19}{5} - x_3 - \frac{9}{5}x_4 \).]

12. We consider buying three kinds of food. Food I has one unit of 
vitamin A, three units of vitamin B, and four units of vitamin C. Food II 
has two, three and five units, respectively. Food III has three units each of 
vitamin A and vitamin C, none of vitamin B. We need to have 11 units of 
vitamin A, 9 of vitamin B, and 20 of vitamin C. 

(a) Find all possible amounts of the three foods that will provide 
precisely these amounts of the vitamins. 

(b) If Food I costs 60 cents and the others cost 10 cents each per unit, 
is there a solution costing exactly $1? [Ans. (b) Yes; 1,2,2.] 

6. THE INVERSE OF A SQUARE MATRIX


If A is a square matrix and B is another square matrix of the same
size having the property that BA = I (where I is the identity matrix),
then we say that B is the inverse of A. When it exists we shall
denote the inverse of A by the symbol \(A^{-1}\). To give a numerical
example, let A and \(A^{-1}\) be the following:

\[ A = \begin{pmatrix} 4 & 0 & 5 \\ 0 & 1 & -6 \\ 3 & 0 & 4 \end{pmatrix}, \qquad A^{-1} = \begin{pmatrix} 4 & 0 & -5 \\ -18 & 1 & 24 \\ -3 & 0 & 4 \end{pmatrix}. \]

Multiplying gives \(A^{-1}A = I\). If we multiply these matrices in the other
order we also get the identity matrix; thus

\[ AA^{-1} = \begin{pmatrix} 4 & 0 & 5 \\ 0 & 1 & -6 \\ 3 & 0 & 4 \end{pmatrix} \begin{pmatrix} 4 & 0 & -5 \\ -18 & 1 & 24 \\ -3 & 0 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = I. \]

In general it can be shown that if A is a square matrix with inverse
\(A^{-1}\), then the inverse satisfies the equation

\[ A^{-1}A = AA^{-1} = I. \]
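
The numerical example can be checked by carrying out both products. Here is a minimal sketch in Python; the helper `mat_mul` is our own name for the ordinary row-by-column product and is not part of the text.

```python
def mat_mul(a, b):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[4, 0, 5],
     [0, 1, -6],
     [3, 0, 4]]
A_inv = [[4, 0, -5],
         [-18, 1, 24],
         [-3, 0, 4]]
identity = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]

left = mat_mul(A_inv, A)    # the product in the order used in the definition
right = mat_mul(A, A_inv)   # the product in the other order
print(left == identity, right == identity)
```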


It is easy to see that a square matrix can have only one inverse.
Suppose that in addition to \(A^{-1}\) we also have a B such that

\[ BA = I. \]

Then we see that

\[ B = BI = B(AA^{-1}) = (BA)A^{-1} = IA^{-1} = A^{-1}. \]

Finding the inverse of a matrix is analogous to finding the reciprocal
of an ordinary number, but the analogy is not complete. Every nonzero
number has a reciprocal, but there are matrices, not the zero
matrix, which have no inverse. For example, if

a -(-\ ~0 a,id b ’G 0 

then 

"-(-! ;Mo s)-»- 

From this it follows that neither A nor B can have an inverse. To
show that A does not have an inverse, let us assume that A had an
inverse \(A^{-1}\). Then

\[ B = (A^{-1}A)B = A^{-1}(AB) = A^{-1}0 = 0, \]

contradicting the fact that \(B \ne 0\). The proof that B cannot have an
inverse is similar.

Let us try to calculate the inverse of a 2 × 2 matrix A; that is,
let us find conditions on a matrix B such that BA = I. Suppose

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} b_{11}a_{11} + b_{12}a_{21} & b_{11}a_{12} + b_{12}a_{22} \\ b_{21}a_{11} + b_{22}a_{21} & b_{21}a_{12} + b_{22}a_{22} \end{pmatrix}. \]

For these equalities to hold, the following equations must be satisfied:

(1)   \( b_{11}a_{11} + b_{12}a_{21} = 1 \)

(2)   \( b_{11}a_{12} + b_{12}a_{22} = 0 \)

(3)   \( b_{21}a_{11} + b_{22}a_{21} = 0 \)

(4)   \( b_{21}a_{12} + b_{22}a_{22} = 1. \)

In these equations the b's are to be regarded as the unknowns and
the a's as constants. Observe that the unknowns \(b_{11}\) and \(b_{12}\) appear
only in equations (1) and (2), while the unknowns \(b_{21}\) and \(b_{22}\) occur
only in equations (3) and (4). In order to eliminate the variable \(b_{12}\)
from equations (1) and (2) we multiply equation (1) by \(a_{22}\) and equation
(2) by \(a_{21}\), giving

\( b_{11}a_{11}a_{22} + b_{12}a_{21}a_{22} = a_{22} \)

\( b_{11}a_{12}a_{21} + b_{12}a_{22}a_{21} = 0. \)

Subtracting the second of these equations from the first, we obtain

\( b_{11}(a_{11}a_{22} - a_{12}a_{21}) = a_{22}. \)

Now, providing the coefficient of \(b_{11}\) in this equation is not zero, we
can divide through by it and obtain the solution for \(b_{11}\) as follows:

\[ b_{11} = \frac{a_{22}}{a_{11}a_{22} - a_{21}a_{12}}. \]

Proceeding in this way we can solve for the other variables, obtaining

\[ b_{12} = \frac{-a_{12}}{a_{11}a_{22} - a_{21}a_{12}}, \qquad b_{21} = \frac{-a_{21}}{a_{11}a_{22} - a_{21}a_{12}}, \qquad b_{22} = \frac{a_{11}}{a_{11}a_{22} - a_{21}a_{12}}. \]

Notice that the denominator of each of these expressions is the same.
This quantity is so important that it is given a special name, namely,
the determinant of A, and it is defined as follows:

\[ \det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{21}a_{12}. \]


Notice that we use vertical bars to denote a determinant, while we 
use vertical parentheses to denote a matrix. 

Numerical examples of determinants are

\[ \begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = 4 - 6 = -2, \qquad \begin{vmatrix} 1 & -1 \\ -1 & 1 \end{vmatrix} = 1 - 1 = 0. \]

We have defined determinants only for 2 × 2 matrices. They can
also be defined for higher-order square matrices, but we shall not
carry it out here.

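Observe that we were able to solve for the inverse of a 2 × 2 matrix
only if its determinant was not zero. Using the formulas derived
previously, the inverse of the matrix

\[ \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \quad \text{is} \quad \begin{pmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{pmatrix}. \]

To check this we observe that

\[ \begin{pmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} -2 & 1 \\ \frac{3}{2} & -\frac{1}{2} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]

On the other hand, the matrix

\[ \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \]

has determinant zero, so that the equations considered previously
cannot be solved, and this matrix has no inverse. The same is true in
general, and the following theorem can be stated.


Theorem. A square matrix has an inverse if and only if its
determinant is nonzero.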


The theorem holds for higher-order square matrices as well, but we 
will not prove it here. 
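
For the 2 × 2 case the formulas derived above can be put directly into a short routine. The following is a minimal sketch in Python, assuming the matrix is given as a list of two rows; the names `det2` and `inverse2` are our own. It returns `None` when the determinant is zero, in agreement with the theorem.

```python
from fractions import Fraction


def det2(m):
    """Determinant a11*a22 - a21*a12 of a 2 x 2 matrix."""
    (a11, a12), (a21, a22) = m
    return a11 * a22 - a21 * a12


def inverse2(m):
    """Inverse of a 2 x 2 matrix by the formulas of this section.

    Returns None when the determinant is zero.
    """
    det = det2(m)
    if det == 0:
        return None
    det = Fraction(det)
    (a11, a12), (a21, a22) = m
    return [[a22 / det, -a12 / det],
            [-a21 / det, a11 / det]]


print(inverse2([[1, 2], [3, 4]]))     # the matrix inverted in the text
print(inverse2([[1, -1], [-1, 1]]))   # determinant zero, so no inverse
```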


EXERCISES 

1. Find the determinants of the following matrices. 

«(J ?> 

<b > (o !} 

«•> (? i> 

«>(; i> 

« G :;> 


[Arts. 1.] 


r Ans . — 1.] 


[Ans, 0.] 
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® (J 4> 

[A ns. —13.] 

2. Find when possible the inverses of the matrices given in Exercise 1. 
Check your work. 

3. Let A be a square matrix that has an inverse. Show that the equations
Ax = b have \(x = A^{-1}b\) as a solution.

4. Use the result of Exercise 3 to solve the equations

\( x_1 + 2x_2 = 1 \)

\( 3x_1 + 4x_2 = 2. \)

(Hint: The inverse of the matrix was computed in the text above.)

[Ans. \(x_1 = 0\), \(x_2 = \frac{1}{2}\).]

5. Let A be one of the matrices in Exercise 1 , and let 

* - (:) “ d 6 - (a> 

Using (when they exist) the inverses computed in Exercise 2 , solve the 
equations Ax = b. (See Exercise 3.) 

6. If \(ad - cb \ne 0\), find the inverse of the matrix

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

7. Let A be the matrix of Exercise 6 . Let 

* - (»;) “ d iei/ ■ (a) 

Then use the inverse found in Exercise 6 to solve the linear equations Ax = /. 

8. If A is a square matrix that has an inverse \(A^{-1}\), show that the inverse
of \(A^2\) is the matrix \([A^{-1}]^2 = A^{-2}\). Show that the inverse of \(A^n\) is the matrix
\([A^{-1}]^n = A^{-n}\).

9. Solve the linear equations

\( 4x + 5z = 7 \)

\( y - 6z = 2 \)

\( 3x + 4z = -1 \)

by writing them in the form Ax = b and using the inverse of A given in the
beginning of this section. (See Exercise 3.)

[Ans. x = 33, y = -148, z = -25.]

10. (a) Prove the following identity for determinants:

\[ \det\left[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} \right] = \begin{vmatrix} a & b \\ c & d \end{vmatrix} \cdot \begin{vmatrix} e & f \\ g & h \end{vmatrix}. \]

(b) Show by an example that it is not true in general that the
determinant of the sum of two matrices is equal to the sum of their
determinants.


11. Use the result of Exercise 10(a) to show that the determinant of the
product of two matrices can be zero only if one of the matrices itself has a
zero determinant.


12. If A and B are n × n matrices each of which has an inverse:

(a) Prove, using Exercise 11, that AB has an inverse.

(b) Express the inverse of AB in terms of \(A^{-1}\) and \(B^{-1}\).


7. APPLICATIONS OF MATRIX THEORY TO MARKOV CHAINS 

For simplicity we shall confine our discussion to three-state Markov 
chains, but a similar procedure will work for any other Markov chain. 

In Section 13 of Chapter IV, we noted that to each Markov chain
there was a matrix of transition probabilities. For example, if there
are three states, \(a_1\), \(a_2\), and \(a_3\), then

\[ P = \begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix} \]

is the transition matrix for the chain. Recall that the row sums of P
are all equal to 1. Such a matrix is called a stochastic matrix.

Definition. A stochastic matrix is a square matrix with nonnegative
entries such that the sum of the entries in each row is 1.

In order to obtain a Markov chain we must specify how the process
starts. Suppose that the initial state is chosen by a chance device that
selects state \(a_j\) with probability \(p_j^{(0)}\). We can represent these initial
probabilities by means of the vector \(p^{(0)} = (p_1^{(0)}, p_2^{(0)}, p_3^{(0)})\). As in
Exercise 10 of Section 4, we can construct a tree measure for as many
steps of the process as we wish to consider. Let \(p_j^{(n)}\) be the probability
that the process will be in state \(a_j\) after n steps. Let the vector of
these probabilities be \(p^{(n)} = (p_1^{(n)}, p_2^{(n)}, p_3^{(n)})\).

Definition. A row vector p is called a probability vector if it has
nonnegative components whose sum is 1.

Obviously the vectors \(p^{(0)}\) and \(p^{(n)}\) are probability vectors. Also
each row of a stochastic matrix is a probability vector.

By means of the tree measure it can be shown that these probabilities
satisfy the following equations:

\( p_1^{(n)} = p_1^{(n-1)}p_{11} + p_2^{(n-1)}p_{21} + p_3^{(n-1)}p_{31} \)

\( p_2^{(n)} = p_1^{(n-1)}p_{12} + p_2^{(n-1)}p_{22} + p_3^{(n-1)}p_{32} \)

\( p_3^{(n)} = p_1^{(n-1)}p_{13} + p_2^{(n-1)}p_{23} + p_3^{(n-1)}p_{33}. \)

It is not hard to give intuitive meanings to these equations. The first
one, for example, expresses the fact that the probability of being in
state \(a_1\) after n steps is the sum of the probabilities of being at each
of the three possible states after n - 1 steps and then moving to
state \(a_1\) on the nth step. The interpretation of the other equations is
similar.

If we recall the definition of the product of a vector times a matrix
we can write the above equations as

\[ p^{(n)} = p^{(n-1)}P. \]

If we substitute small values of n we get the equations: \(p^{(1)} = p^{(0)}P\);
\(p^{(2)} = p^{(1)}P = p^{(0)}P^2\); \(p^{(3)} = p^{(2)}P = p^{(0)}P^3\); etc. In general, it can
be seen that

\[ p^{(n)} = p^{(0)}P^n. \]

Thus we see that, if we multiply the vector \(p^{(0)}\) of initial probabilities
by the nth power of the transition matrix P, we obtain the vector
\(p^{(n)}\), whose components give the probabilities of being in each of the
states after n steps.

In particular, let us choose \(p^{(0)} = (1,0,0)\) which is equivalent to
letting the process start in state \(a_1\). From the equation above we see
that then \(p^{(n)}\) is the first row of the matrix \(P^n\). Thus the elements
of the first row of the matrix \(P^n\) give us the probabilities that after n
steps the process will be in a given one of the states, under the
assumption that it started in state \(a_1\). In the same way, if we choose
\(p^{(0)} = (0,1,0)\), we see that the second row of \(P^n\) gives the probabilities
that the process will be in one of the various states after n steps, given
that it started in state \(a_2\). Similarly the third row gives these
probabilities, assuming that the process started in state \(a_3\).

In Section 13 of Chapter IV, we considered special Markov chains
that started in given fixed states. There we arrived at a matrix \(P^{(n)}\)
whose ith row gave the probabilities of the process ending in the
various states, given that it started at state \(a_i\). By comparing the
work that we did there with what we have just done, we see that the
matrix \(P^{(n)}\) is merely the nth power of P, that is, \(P^{(n)} = P^n\).
(Compare Exercise 10 of Section 4.) Matrix multiplication thus gives a
convenient way of computing the desired probabilities.

The equation \(p^{(n)} = p^{(n-1)}P\) admits of another interesting
interpretation. The vector \(p^{(n)}\) is obtained from the vector \(p^{(n-1)}\) by a
transformation consisting of multiplying it by the matrix P. The
vector \(p^{(n-1)}\) is obtained from \(p^{(n-2)}\) by a similar transformation, etc.
Let us indicate the transformation by an arrow; thus we have

\[ p^{(0)} \to p^{(1)} \to p^{(2)} \to \cdots \to p^{(n-1)} \to p^{(n)}. \]

In Section 9 we shall show that this transformation is what is called
a linear transformation of vectors. We shall say that the transformation
sends the vector \(p^{(0)}\) onto the vector \(p^{(1)}\), and sends \(p^{(1)}\) onto
\(p^{(2)}\), etc.

Sometimes it happens that there is a probability vector t which is
sent by the transformation P onto itself, that is, t = tP. If we interpret
t as a point in Euclidean space, then we say that t is a fixed point
of the transformation P.

Definition. The probability vector t is a fixed point of the
transformation P, if t = tP.


Example. Consider the stochastic matrix

\[ P = \begin{pmatrix} \frac{2}{3} & \frac{1}{3} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix} = \begin{pmatrix} .667 & .333 \\ .500 & .500 \end{pmatrix}. \]

If t = (.6, .4), then we see that

\[ tP = (.6, .4) \begin{pmatrix} \frac{2}{3} & \frac{1}{3} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix} = (.6, .4) = t, \]

so that t is a fixed point of the transformation P.

If we had happened to choose the vector t as our initial probability
vector \(p^{(0)}\), we would have had \(p^{(n)} = p^{(0)}P^n = tP^n = t = p^{(0)}\). In
this case the probability of being at any particular state is the same
at all steps of the process. Such a process we shall call a stationary
Markov process.

As seen above, in the study of Markov chains we are interested in
the powers of the matrix P. To see what happens to these powers,
let us further consider the example.

Example (continued). Suppose that we compute powers of the
matrix P in the example above. We have

\[ P^2 = \begin{pmatrix} .611 & .389 \\ .583 & .417 \end{pmatrix}, \qquad P^3 = \begin{pmatrix} .602 & .398 \\ .597 & .403 \end{pmatrix}. \]

It looks as if the matrix \(P^n\) is approaching the matrix

\[ T = \begin{pmatrix} .6 & .4 \\ .6 & .4 \end{pmatrix} \]

and, in fact, it can be shown that this is the case. (When we say that
\(P^n\) approaches T we mean that each entry in the matrix \(P^n\) gets close
to the corresponding entry in T.) Note that each row of T is the fixed
point t of the matrix P.

We cannot prove here that \(P^n\) approaches T, so we shall content
ourselves with stating a useful result. For that we need the definition
of a regular matrix.

Definition. A stochastic matrix is said to be regular if some power 
of the matrix has only positive components. 

Thus the matrix in the example is regular, since every entry in it is 
positive, so that the first power of the matrix has all positive entries. 
Other examples occur in the exercises. 
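
The definition suggests a direct test: compute successive powers and look for one with all entries positive. The following is a minimal sketch in Python. The function names are ours, and the cutoff `max_power` is an arbitrary assumption made for illustration, so a result of `None` means only that no positive power was found up to that cutoff.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def first_positive_power(p, max_power=50):
    """Smallest n <= max_power for which every entry of p**n is positive.

    Returns None if no such power is found up to the cutoff.
    """
    power = p
    for n in range(1, max_power + 1):
        if all(entry > 0 for row in power for entry in row):
            return n
        power = mat_mul(power, p)
    return None


half = Fraction(1, 2)
print(first_positive_power([[Fraction(2, 3), Fraction(1, 3)], [half, half]]))
print(first_positive_power([[0, 1], [half, half]]))
print(first_positive_power([[0, 1], [1, 0]]))
```

The last matrix merely interchanges the two states at each step, so its powers alternate between itself and the identity and always contain zeros.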

Theorem. If P is a regular stochastic matrix, then

(i) The powers \(P^n\) approach a matrix T.

(ii) Each row of T is the same probability vector t.

(iii) The components of t are positive.

We omit the proof of this theorem; however we can prove the next
theorem.

Theorem. If P is a regular stochastic matrix, and T and t are
given by the previous theorem, then

(a) If p is any probability vector, \(pP^n\) approaches t.

(b) The vector t is the unique fixed point probability vector of P.

Proof. First let us consider the vector pT. The first column of T
has a \(t_1\) in each row. Hence in the first component of pT each component
of p is multiplied by \(t_1\), and therefore we have \(t_1\) times the sum
of the components of p, which is \(t_1\). Doing the same for the other
components, we note that pT is simply t. But \(pP^n\) approaches pT;
hence it approaches t. Thus if any probability vector is transformed
repeatedly by P, it approaches the fixed point t. This proves part (a).

Since the powers of P approach T, \(P^{n+1} = P^nP\) approaches T, but
it also approaches TP; hence TP = T. Any one row of this matrix
equation states that tP = t; hence t is a fixed point (and by the
previous theorem a probability vector). We must still show that it is
unique. Let u be any probability vector fixed point of P. By part (a)
we know that \(uP^n\) approaches t. But since u is a fixed point, \(uP^n = u\).
Hence u remains fixed but "approaches" t. This is possible only if
u = t. Hence t is the only probability vector fixed point. This
completes the proof of part (b).

The following is an important consequence of this theorem. If we
take as p the vector \(p^{(0)}\) of initial probabilities, then the vector
\(pP^n = p^{(n)}\) gives the probabilities after n steps, and this vector
approaches t. Therefore, no matter what the initial probabilities are,
if P is regular, then after a large number of steps the probability that
the process is in state \(a_j\) will be very nearly \(t_j\).

Example (continued). Let us take \(p^{(0)} = (.1, .9)\) and see how the
successive transformations of it change. Using P as in the example
above, we have that \(p^{(1)} = (.5167, .4833)\), \(p^{(2)} = (.5861, .4139)\), and
\(p^{(3)} = (.5977, .4023)\). Recalling that t = (.6, .4), we see that these
vectors do approach t. They are plotted in Figure 5.
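
The successive vectors in this example come from repeated multiplication by P. A minimal sketch in Python follows; the function name `step` is our own, and the rounding to four places is done only to match the way the numbers are printed in the text.

```python
from fractions import Fraction


def step(p, P):
    """One transformation p -> pP of a row vector by the matrix P."""
    return [sum(p[i] * P[i][j] for i in range(len(p))) for j in range(len(P[0]))]


P = [[Fraction(2, 3), Fraction(1, 3)],
     [Fraction(1, 2), Fraction(1, 2)]]
p = [Fraction(1, 10), Fraction(9, 10)]   # the initial vector (.1, .9)

history = []
for n in range(3):
    p = step(p, P)
    history.append([round(float(v), 4) for v in p])

for n, vector in enumerate(history, start=1):
    print(n, vector)
```

Continuing the loop for more steps brings the vectors as close to t = (.6, .4) as desired, as the theorem asserts.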

As a final example let us derive the formulas for the fixed point
of a 2 × 2 stochastic matrix with positive components. Such a matrix
is of the form

\[ S = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix} \]

where 0 < a < 1 and 0 < b < 1. Since S is regular, it has a unique
probability vector fixed point \(t = (t_1, t_2)\). Its components must satisfy
the equations

\( t_1(1 - a) + t_2 b = t_1 \)

\( t_1 a + t_2(1 - b) = t_2. \)

Each of these equations reduces to the single equation \(t_1 a = t_2 b\). This
single equation has an infinite number of solutions. However, since t
is a probability vector, we must also have \(t_1 + t_2 = 1\), and the new
equation gives the point \([b/(a+b), a/(a+b)]\) as the unique fixed-point
probability vector of S.
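
The formula just derived can be checked against the earlier example, where a = 1/3 and b = 1/2. The following is a minimal sketch in Python; the function name `fixed_point` is our own choice for illustration.

```python
from fractions import Fraction


def fixed_point(a, b):
    """Fixed-point probability vector (b/(a+b), a/(a+b)) of the matrix
    S with rows (1 - a, a) and (b, 1 - b)."""
    a, b = Fraction(a), Fraction(b)
    return (b / (a + b), a / (a + b))


a, b = Fraction(1, 3), Fraction(1, 2)
t = fixed_point(a, b)

# Verify directly that tS = t for this S.
S = [[1 - a, a], [b, 1 - b]]
tS = tuple(t[0] * S[0][j] + t[1] * S[1][j] for j in range(2))
print(t, tS == t)
```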



Figure 5


EXERCISES 

1. Which of the following matrices are regular? 

(a) (b) 0 [Regular] 

w G t) <d) (1 0) tE,w *'* ri 
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(e) ^ (f) [Not regular] 

A 0 A 

(h) I 0 1 0 1. [Not regular] 

\0 ii/ 

2. Show that the 2 × 2 matrix

\[ S = \begin{pmatrix} 1-a & a \\ b & 1-b \end{pmatrix} \]

is a regular stochastic matrix if and only if either

(i) \(0 < a \le 1\) and \(0 < b < 1\); or

(ii) \(0 < a < 1\) and \(0 < b \le 1\).

3. Find the fixed point for the matrix in Exercise 2 for each of the cases 
listed there. {Hint: Most of the cases were covered in the text above.) 

4. Find the fixed point t for each of the following regular matrices. 

[Ans. t = (§,*).] 


[AUS. t — (f;f;f)-] 

5. Let p {0) = (J,§) and compute p (1) , p (2) , and p (3) for each of the matrices 
in Exercises 4(a) and 4(b). Do they approach the fixed points of these 
matrices? 

6. Give a probability theory interpretation to the condition of regularity. 

7. Consider the two-state Markov chain with transition matrix 


CLl Cl 2 



What is the probability that after n steps the process is in state \(a_1\) if it started
in state \(a_2\)? Does this probability become independent of the initial position
for large n? If not, the theorem of this section must not apply. Why? Does
the matrix have a unique fixed point probability vector?

8. Prove that, if a regular 3 × 3 transition matrix has the property that
its column sums are 1, its fixed point probability vector is \((\frac{1}{3}, \frac{1}{3}, \frac{1}{3})\). State a
similar result for n × n transition matrices having column sums equal to 1.




(g) 0 i i 
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9. Compute the first five powers of the matrix 


(5 i> 


From these, guess the fixed point vector t. Check by computing what t is. 

10. Show that all stochastic matrices of the form

$$\begin{pmatrix} 1-a & a \\ a & 1-a \end{pmatrix},$$

where $0 < a < 1$, have the same unique fixed point. [Ans. $t = (\frac{1}{2}, \frac{1}{2})$.]

11. In Exercise 9 take p = (.7,.3) and compute $pP$, $pP^2$, $pP^3$, $pP^4$, and $pP^5$.
Compare your results with the fixed vector t.

12. Compute the fixed vector of the matrix

$$S = \begin{pmatrix} .1 & .9 \\ .6 & .4 \end{pmatrix}.$$

Let p = (.5,.5) and compute $pS$, $pS^2$, and $pS^3$. Plot these vectors as is done
in Figure 5.

[Ans. $pS$ = (.35,.65), $pS^2$ = (.425,.575), $pS^3$ = (.3875,.6125), t = (.4,.6).]


13. Let S be the matrix

[matrix illegible in source]

Compute the unique probability vector fixed point of S, and use your result
to prove that S is not regular.

14. Show that the matrix

$$S = \begin{pmatrix} 1 & 0 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 0 & 1 \end{pmatrix}$$

has more than one probability vector fixed point. Find the matrix that $S^n$
approaches, and show that it is not a matrix all of whose rows are the same.

15. Show that the matrix

[matrix illegible in source]

is a regular matrix.


8. EXAMPLES OF MARKOV CHAINS 

In this section we shall apply the results of the last section to several 
different Markov chains. Some of these will be new Markov chains, 
and some will be taken from the examples in Chapter IV, Section 13.

Example 1. Suppose that the President of the United States tells
person A his intention either to run or not to run in the next election. 
Then A relays the news to B, who in turn relays the message to C, 
etc., always to some new person. Assume that there is a probability 
p > 0 that any one person, when he gets the message, will reverse it 
before passing it on to the next person. What is the probability that 
the nth man to hear the message will be told that the President will 
run? We can consider this as a two-state Markov chain, with states 
indicated by “yes” and “no.” The process is in state “yes” at time 
n if the nth person to receive the message was told that the President 
would run. It is in state “no” if he was told that the President would 
not run. The matrix P of transition probabilities is then

$$P = \begin{array}{c|cc}
 & \text{yes} & \text{no} \\ \hline
\text{yes} & 1-p & p \\
\text{no} & p & 1-p
\end{array}$$

Then the matrix $P^n$ gives the probabilities that the nth man is given
a certain answer, assuming that the President said “yes” (first row)
or assuming that the President said “no” (second row). We know
that these rows approach t. From the formulas of the last section,
we find that $t = (\frac{1}{2}, \frac{1}{2})$. Hence the probabilities for the nth man being
told “yes” or “no” approach $\frac{1}{2}$ independently of the initial decision
of the President. For a large number of people, we can expect that 
approximately one-half will be told that the President will run and the 
other half that he will not, independently of the actual decision of the 
President. 
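
The approach of the rows of the powers of P to (1/2, 1/2) can be watched numerically. The following is a minimal sketch in Python, not part of the original text; the reversal probability p = 0.1 and the helper names are hypothetical choices made only for illustration.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(P, n):
    """n-th power of a square matrix by repeated multiplication."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result

p = 0.1                          # hypothetical probability of reversing the message
P = [[1 - p, p], [p, 1 - p]]     # rows and columns in the order yes, no

for n in (1, 5, 20, 50):
    yes_row, no_row = mat_pow(P, n)
    # probability that the nth person hears "yes", for each initial decision
    print(n, round(yes_row[0], 4), round(no_row[0], 4))
```

The two printed columns start far apart and draw together as n grows, which is the independence from the President's actual decision described above.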

Suppose now that the probability a that a person will change the 
news from “yes” to “no” when transmitting it to the next person is 
different from the probability b that he will change it from “no” to 
“yes.” Then the matrix of transition probabilities becomes

$$P = \begin{array}{c|cc}
 & \text{yes} & \text{no} \\ \hline
\text{yes} & 1-a & a \\
\text{no} & b & 1-b
\end{array}$$

In this case $t = (b/(a+b),\ a/(a+b))$. Thus there is a probability
of approximately $b/(a+b)$ that the nth person will be told that the
President will run. Assuming that n is large, this probability is independent
of the actual decision of the President. For n large we
can expect, in this case, that a proportion approximately equal to
$b/(a+b)$ will have been told that the President will run, and a proportion
$a/(a+b)$ will have been told that he will not run. The
important thing to note is that, from the assumptions we have made, 
it follows that it is not the President but the people themselves who 
determine the probability that a person will be told “yes” or “no,” 
and the proportion of people in the long run that are given one of 
these predictions. 

Example 2. For this example, we continue the study of Example 2 
in Chapter IV, Section 13. The first approximation treated in that 
example leads to a two-state Markov chain, and the results are similar 
to those obtained in Example 1 above. The second approximation led 
to a four-state Markov chain with transition probabilities given by 
the matrix 

$$\begin{array}{c|cccc}
 & RR & DR & RD & DD \\ \hline
RR & 1-a & 0 & a & 0 \\
DR & b & 0 & 1-b & 0 \\
RD & 0 & 1-c & 0 & c \\
DD & 0 & d & 0 & 1-d
\end{array}$$

If a, b, c, and d are all different from 0 or 1, then the square of the
matrix has no zeros, and hence the matrix is regular. The fixed probability
vector is found in the usual way (see Exercise 12) and is

$$\left( \frac{bd}{bd + 2ad + ca},\ \frac{ad}{bd + 2ad + ca},\ \frac{ad}{bd + 2ad + ca},\ \frac{ca}{bd + 2ad + ca} \right).$$

Note that the probability of being in state RD after a large number
of steps is equal to the probability of being in state DR. This shows
that in equilibrium a change from R to D must have the same probability
as a change from D to R.

From the fixed vector we can find the probability that an election
in the far future will result in a victory for the Republicans. This is
found by adding the probabilities of being in states RR and DR, giving

$$\frac{bd + ad}{bd + 2ad + ca}.$$

Notice that, to find the probability of a Republican victory in the
year preceding some year far in the future, we should add the probabilities
of being in states RR and RD. That we get the same result
corresponds to the fact that predictions far in the future are essentially
independent of the particular year being predicted. In other
words, the process is acting as if it were a stationary process.
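
The fixed vector of Example 2 can be verified for particular values of the parameters. The following is a minimal sketch in Python, not part of the original text; the values of a, b, c, and d and the function names are hypothetical, and the states are taken in the order RR, DR, RD, DD used in the matrix above.

```python
from fractions import Fraction as F

def example2_matrix(a, b, c, d):
    """Transition matrix of Example 2, states ordered RR, DR, RD, DD."""
    return [[1 - a, 0, a, 0],
            [b, 0, 1 - b, 0],
            [0, 1 - c, 0, c],
            [0, d, 0, 1 - d]]

def example2_fixed_vector(a, b, c, d):
    """The vector (bd, ad, ad, ca) divided by bd + 2ad + ca."""
    total = b * d + 2 * a * d + c * a
    return [b * d / total, a * d / total, a * d / total, c * a / total]

def row_times_matrix(t, P):
    return [sum(t[i] * P[i][j] for i in range(len(t))) for j in range(len(P[0]))]

a, b, c, d = F(1, 3), F(1, 2), F(1, 4), F(2, 5)   # hypothetical parameters
P = example2_matrix(a, b, c, d)
t = example2_fixed_vector(a, b, c, d)

print(row_times_matrix(t, P) == t)      # is t a fixed vector?
print(sum(t) == 1)                      # is t a probability vector?
print(t[0] + t[1] == t[0] + t[2])       # RR + DR equals RR + RD
```

The last line reflects the remark that the probability of a Republican victory is the same whether it is read from the states RR and DR or from RR and RD.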

Example 3. Here we continue the study of Example 3 in Chapter 
IV, Section 13. In that example, for any particular n we could write 
the transition matrix and solve for the fixed vector. However, it turns 
out to be more instructive to try and guess the answer. It would seem 
that after a large number of exchanges the balls should become pretty 
well scrambled up, and that the probability that any particular n 
balls should be in urn 1 should be the same as if we simply took n balls 
at random from 2 n balls, n black and n white, and put them in the 
urn. If this were done, the probability that we would put j black balls 
in the urn is

$$\frac{\binom{n}{j}\binom{n}{n-j}}{\binom{2n}{n}} = \frac{\binom{n}{j}^{2}}{\binom{2n}{n}}.$$

It can be checked that these probabilities do satisfy the necessary 
equations for a fixed vector for the matrix, for any n. 

Example 4. The following example shows how the above ideas can 
be applied to a situation which does not seem to be a probabilistic 
situation but which can be so interpreted. 

Suppose that, in a certain city, each year four per cent of the people 
in the city (proper) move to the suburbs, and one per cent of the 
people in the suburbs move to the city. Assuming that the total 
number of people in the city plus its suburbs remains constant, what 
is the ultimate distribution of people between the city and suburbs? 
The matrix P of Figure 6 shows the situation.

$$P = \begin{array}{l|cc}
 & \text{Move to city} & \text{Move to suburbs} \\ \hline
\text{People in city} & .96 & .04 \\
\text{People in suburbs} & .01 & .99
\end{array}$$

Figure 6

Let $x^{(0)}$ be the vector $(x_1^{(0)}, x_2^{(0)})$, where $x_1^{(0)}$ is the proportion of people
in the city when the study begins and $x_2^{(0)}$ is the proportion in the
suburbs at this time. Then denote by $x^{(n)} = (x_1^{(n)}, x_2^{(n)})$ the vector
giving the corresponding proportions after n years. Then, by the
same argument as given for the probabilities in the case of a Markov
chain, it follows that

$$x^{(n)} = x^{(0)} P^n.$$

Also, since our matrix P could be interpreted as a regular transition
matrix for a Markov chain, and since the vector $x^{(0)}$ is a probability
vector, we can apply the theorem from the theory of Markov chains
and obtain the fact that the vector $x^{(n)}$ approaches the vector
$t = (t_1, t_2)$, which is the unique probability vector fixed point of P.
This vector is t = (.2,.8). Thus we can conclude that after a long
time there will be a fraction approximately equal to 20 per cent of
the people in the city proper and 80 per cent of the people in the suburbs,
independent of the fraction initially in the city and in the
suburbs. Notice that, after a long time has elapsed, a fraction
(.2)(.04) = .008 of the people move in a given year from the city
to the suburbs and a fraction (.8)(.01) = .008 of the people move
in a given year from the suburbs to the city. That is, in the “equilibrium”
position which is reached after a long time, there are just as
many people moving from the city to the suburbs as there are from
the suburbs to the city.
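
The convergence described in Example 4 can be observed by carrying the proportions forward year by year. The following is a minimal sketch in Python, not part of the original text; the three starting distributions and the number of years are hypothetical choices.

```python
def step(x, P):
    """One year: the row vector x of proportions times the matrix P."""
    return [sum(x[i] * P[i][j] for i in range(len(x))) for j in range(len(P[0]))]

P = [[0.96, 0.04],    # people in the city: stay, move to suburbs
     [0.01, 0.99]]    # people in the suburbs: move to city, stay

for start in ([1.0, 0.0], [0.5, 0.5], [0.0, 1.0]):   # hypothetical initial splits
    x = start
    for _ in range(500):
        x = step(x, P)
    flow_out = x[0] * P[0][1]   # fraction moving from the city to the suburbs
    flow_in = x[1] * P[1][0]    # fraction moving from the suburbs to the city
    print(start, [round(v, 4) for v in x], round(flow_out, 4), round(flow_in, 4))
```

Each starting split leads to essentially the same proportions, and the two yearly flows balance, as the example asserts.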

Example 5. Suppose that in Example 4 we are interested in the 
following kind of problem. Assume that there is a known number of 
Republicans and Democrats initially in the city and in the suburbs. 
Assume also that there is no changing of party affiliations and that 
the decision to move at any given time is independent of the party 
affiliations. What then is the probability that after a long time there 
will be a particular party division in the city? (We know from the 
previous example that eventually the situation becomes one of simply 
transferring a fixed number of people from the city to the suburbs 
and the same number from the suburbs to the city in each year.) We 
shall indicate how this problem can be solved for a very simple special 
case. 

Assume that there are two people in the city and eight in the suburbs
when the equilibrium position is reached, and that each year
one moves from the city to the suburbs and one from the suburbs to
the city. Assume further that there are four Republicans and six
Democrats in all. We form a Markov chain by taking as a state the
number of people in the city who are Republicans at any one time.
Thus the states can be represented by the numbers 0, 1, and 2. The
transition probabilities can be calculated as follows. Assume, for example,
that we are in state 1.

Then the situation is represented in Figure 7.

[Figure 7: in state 1 the city holds one Republican and one Democrat, and the suburbs hold three Republicans and five Democrats.]

To go from state 1 to state 0 we must choose a Republican from
the city and a Democrat from the suburbs. This happens with probability
$\frac{1}{2} \cdot \frac{5}{8} = \frac{5}{16}$. To go from state
1 to state 1 we must choose either
a Republican from each place or a Democrat from each place. The
probability of this occurring is
$\frac{1}{2} \cdot \frac{3}{8} + \frac{1}{2} \cdot \frac{5}{8} = \frac{1}{2}$. The other transition
probabilities are calculated in a similar manner. We obtain the matrix
of transition probabilities

$$P = \begin{array}{c|ccc}
 & 0 & 1 & 2 \\ \hline
0 & \frac{1}{2} & \frac{1}{2} & 0 \\
1 & \frac{5}{16} & \frac{1}{2} & \frac{3}{16} \\
2 & 0 & \frac{3}{4} & \frac{1}{4}
\end{array}$$

Our matrix is a regular matrix, since its square has all positive
elements, and $t = (\frac{1}{3}, \frac{8}{15}, \frac{2}{15})$. (See Exercise 7.) Thus, after a long time
the probability that the city will have no Republicans is approximately
$\frac{1}{3}$, the probability for one Republican is $\frac{8}{15}$, and for two Republicans
$\frac{2}{15}$. Again, these probabilities are independent of the initial
number.

These limiting probabilities are the same that one obtains if one 
chooses two people at random from a group of ten, of whom four are 
Republicans and six are Democrats, and asks for the probability of 
obtaining zero, one, or two Republicans. (See Exercise 8.) 
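
The transition probabilities of Example 5 and the agreement of the fixed vector with the random-selection probabilities can both be checked by machine. The following is a minimal sketch in Python, not part of the original text; it uses the sizes stated in the example (two people in the city, eight in the suburbs, four Republicans in all), and the function and variable names are hypothetical.

```python
from fractions import Fraction as F
from math import comb

CITY, SUBURBS, REPUBLICANS = 2, 8, 4     # sizes taken from Example 5

def transition_row(r):
    """Row of P for the state r = number of Republicans in the city."""
    rep_city = F(r, CITY)                    # the person leaving the city is Republican
    rep_sub = F(REPUBLICANS - r, SUBURBS)    # the person leaving the suburbs is Republican
    row = [F(0)] * (CITY + 1)
    if r > 0:
        row[r - 1] = rep_city * (1 - rep_sub)       # Republican out, Democrat in
    if r < CITY:
        row[r + 1] = (1 - rep_city) * rep_sub       # Democrat out, Republican in
    row[r] = rep_city * rep_sub + (1 - rep_city) * (1 - rep_sub)
    return row

P = [transition_row(r) for r in range(CITY + 1)]

# Probabilities of 0, 1, 2 Republicans when two people are drawn from ten.
t = [F(comb(4, r) * comb(6, 2 - r), comb(10, 2)) for r in range(3)]
tP = [sum(t[i] * P[i][j] for i in range(3)) for j in range(3)]

print(P)
print(t)
print(tP == t)   # the selection probabilities form a fixed vector of P
```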

It is interesting to note that we ended up here with a chain like 
that in Example 3, which has been used by physicists as a crude 
model for diffusion of gases. 


Example 6. Suppose that you are confronted with two slot machines
each of which, when it pays off, pays off the same amount.
You are told that machine A pays off each time with probability $\frac{1}{2}$,
and that machine B pays off each time with probability $\frac{1}{4}$. If you are
going to play one of the machines, you would naturally like to play A.
Unfortunately, you are not told which machine is which. We shall
consider two systems which you might use and for each system find 
the fraction of times you can expect to win. 

As a first system we assume that you decide to use only the result
of the last play to make your decision whether to switch machines
or to stay with the machine you played last time. For the first play
we assume that you choose a machine at random. If you win on your
first play, then the conditional probability that you have played machine
A is $\frac{2}{3}$. (See Chapter IV, Section 7, Example 3.) Hence, if you
win on the first play, you stay with the same machine next time. If
you lose on the first play, the conditional probability that you have
played machine A is $\frac{2}{5}$. Hence, you would switch machines in this
case. The assumption that you use only the last result makes every
play look like the first play, and hence you always use the same rule:
play the same machine next time if you won last time, and switch
machines if you lost.

To find how you will fare under this system we form a Markov
chain by taking as states the machines A and B. If you play machine
A, the probability that you play A next time is $\frac{1}{2}$, i.e., the probability
that you win. The other transition probabilities are found similarly,
and we have the matrix of transition probabilities

$$\begin{array}{c|cc}
 & A & B \\ \hline
A & \frac{1}{2} & \frac{1}{2} \\
B & \frac{3}{4} & \frac{1}{4}
\end{array}$$

The fixed vector of this matrix, $(\frac{3}{5}, \frac{2}{5})$, gives the probability after a
large number of plays that you will play each of the machines. Thus we
see that, by this system, you can expect to play machine A about $\frac{3}{5}$ of
the time. Hence you will win approximately
$\frac{3}{5} \cdot \frac{1}{2} + \frac{2}{5} \cdot \frac{1}{4} = \frac{2}{5}$ of the
time.

As a second system we assume that you use the outcome of the
last two plays to determine whether to switch or not. We assume that
after each decision you make two plays and on the basis of the outcome
of these two plays make your next decision. In this case the
conditional probability that you have machine A, given that you have
made two plays resulting in win-win, is greater than $\frac{1}{2}$. This probability
is also greater than $\frac{1}{2}$ for the case that plays resulted in win-lose
or lose-win. (See Exercise 10.) Hence in each of these cases you stay
with the same machine for your next two plays. In the case of lose-lose,
the probability that you have machine A is less than $\frac{1}{2}$, and you
switch machines. Under this system of play the matrix of transition
probabilities is

$$\begin{array}{c|cc}
 & A & B \\ \hline
A & \frac{3}{4} & \frac{1}{4} \\
B & \frac{9}{16} & \frac{7}{16}
\end{array}$$

where A means playing machine A twice. The fixed vector for this
matrix is $(\frac{9}{13}, \frac{4}{13})$. Thus you will by this system play machine A about
$\frac{9}{13}$ of the time, and you will win a fraction of the time approximately
equal to $\frac{9}{13} \cdot \frac{1}{2} + \frac{4}{13} \cdot \frac{1}{4} = \frac{11}{26} = .42$. Thus the more complicated system
has improved your expectation by only .02. The best that any system
could insure is an expectation of $\frac{1}{2}$.
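
Both systems of Example 6 follow one pattern: play k times, then switch if and only if the conditional probability of having machine A is less than one-half. The following is a minimal sketch in Python of that pattern, not part of the original text. It assumes that machine A pays off with probability 1/2 and machine B with probability 1/4, values consistent with the figures .42 and .02 quoted in the example; the function names are hypothetical.

```python
from fractions import Fraction as F
from math import comb

PAY = {"A": F(1, 2), "B": F(1, 4)}    # assumed payoff probabilities

def posterior_a(wins, k):
    """P(machine is A | wins successes in k plays), starting from even odds."""
    like_a = PAY["A"] ** wins * (1 - PAY["A"]) ** (k - wins)
    like_b = PAY["B"] ** wins * (1 - PAY["B"]) ** (k - wins)
    return like_a / (like_a + like_b)

def win_fraction(k):
    """Long-run fraction of wins when the decision is reviewed after every k plays."""
    stay = {}
    for machine, q in PAY.items():
        # probability that k plays on this machine lead the player to stay
        stay[machine] = sum(comb(k, w) * q ** w * (1 - q) ** (k - w)
                            for w in range(k + 1)
                            if posterior_a(w, k) >= F(1, 2))
    a = 1 - stay["A"]            # probability of passing from A to B
    b = 1 - stay["B"]            # probability of passing from B to A
    share_a = b / (a + b)        # first component of the fixed vector
    return share_a * PAY["A"] + (1 - share_a) * PAY["B"]

for k in (1, 2, 3):
    print(k, win_fraction(k), round(float(win_fraction(k)), 3))
```

The cases k = 1 and k = 2 correspond to the first and second systems of the example, and k = 3 to the system proposed in Exercise 11.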

EXERCISES 

1. The land of Oz is blessed by many things, but not good weather. 
They never have two nice days in a row. If they have a nice day they are just 
as likely to have snow as rain the next day. If they have snow (or rain), 
they have an even chance of having the same the next day. If there is a 
change from snow or rain, only half of the time is this a change to a nice day. 
Set up a three-state Markov chain to describe this situation. Find the long- 
range probability for rain, for snow, and for a nice day. What fraction of the 
days does it rain in the land of Oz? 

[Ans. The probabilities are: nice, $\frac{1}{5}$; rain, $\frac{2}{5}$; snow, $\frac{2}{5}$.]

2. In Example 2, assume that a = i, b = c = i, and d = }. Find the 
fixed vector. What proportion of future elections can be expected to be 
Republican victories under these assumptions? 

3. In Example 2, assume that a = 1 and b = 1 and c = d, and that c is
neither 0 nor 1. Show that the matrix is regular. Find the fixed vector.
What limitation do our assumptions put on possible sequences of election
outcomes? [Ans. t = (J, J, £).]

4. In Example 2, assume that a = 0 and b, c, and d are different from
0 or 1. Is the matrix regular? Show that (1,0,0,0) is a fixed vector. Interpret
this vector in terms of long-range predictions.

5. In Example 2, assume that a = 0, d = 0, and b and c are different 
from 1 or 0. What can be said in this case about the nature of the political 
system after a long time? 

[Ans. From some time on, the party in power remains in power.]

6. In Example 3, find the matrix of transition probabilities for the case
n = 2, and show that the fixed vector agrees with the result guessed on intuitive
grounds in the discussion of the example.

7. In Example 5, check that the fixed probability vector is $(\frac{1}{3}, \frac{8}{15}, \frac{2}{15})$.

8. Assume that from a group of ten people, four of whom are Republicans 
and six are Democrats, two are chosen at random. Find the probability that 
the two are both Republicans, that one is Republican and one is a Democrat, 
and that both are Democrats. 

9. Assume that a certain salesman always goes from city A to city B,
and always from city B to city C. However, from city C he goes with probability
$\frac{1}{2}$ to city A and with probability $\frac{1}{2}$ to city B. Form a Markov chain
to represent his travels. Is the matrix of transition probabilities regular? If
so, find the fixed vector. [Ans. Yes. $t = (\frac{1}{5}, \frac{2}{5}, \frac{2}{5})$.]

10. Referring to the second system in Example 6, find the probability
that the player has chosen machine A, given that the first two outcomes were
win-win. Answer the same question for each of the other three possibilities
from the first two outcomes.

[Ans. Win-win, $\frac{4}{5}$; win-lose, $\frac{4}{7}$; lose-win, $\frac{4}{7}$; lose-lose, $\frac{4}{13}$.]

11. In Example 6 assume that the player decides after every three plays 
which machine to play for the next three plays. He changes machines if and 
only if the conditional probability that he has machine A, given the last 
three outcomes, is less than one-half. Find the fraction of time that the 
player can expect to win using this system. Is this a better system than that 
based on only the last two times, given in Example 6? [Ans. 0.41; no.] 

12. Show that the vector given in Example 2 is the fixed vector of the 
matrix of transition probabilities. 

13. In Chapter IV, Section 13, Exercise 10, find the fixed point probability
vector, and interpret it. 

14. A professor tries not to be late for class too often. If he is late one day, 
he is 90 per cent sure to be on time next time. If he is on time, then the next 
day there is a 30 per cent chance of his being late. In the long run, how often 
is he late for class? 

15. A professor has three pet questions, one of which occurs on every test
he gives. The students know his habits well. He never uses the same question
twice in a row. If he used question one last time, he tosses a coin, and uses
question two if a head comes up. If he used question two, he tosses two coins
and switches to question three if both come up heads. If he used question
three, he tosses three coins and switches to question one if all three come up
heads. In the long run, which question does he use most often, and how
frequently is it used? [Ans. Question two, 40 per cent of the time.]

*9. LINEAR FUNCTIONS AND TRANSFORMATIONS

The primary use of vectors and matrices in science is the representation
of several different quantities as a single one. For example,
the demands on all the industries in the United States may be represented
by a row vector x. We have seen examples where such a
vector is multiplied by a column vector y, giving the number x · y.
The components of y could be the values of unit outputs of the various
industries. Then x · y is the total monetary value of the demand on
industries.

This illustration is typical of much that we meet in the sciences.
It has two fundamental properties. If the demand increases by a
given factor k, then (kx) · y = k(x · y), and hence the value increases
by the same factor. And if we have two demand vectors x and x',
then (x + x') · y = (x · y) + (x' · y), and hence their values are also
added.

Thus we see that y has the effect of assigning to each row vector 
x a number f(x), and has the two very simple properties, 

(i) f(kx) = kf(x)

(ii) f(x + x') = f(x) + f(x')

Such an assignment of a number to each row vector x we call a linear
function. We have seen that each column vector with n components
defines a linear function for row vectors with n components. 

Linear functions represent the simplest type of dependence. Fortunately,
very many problems can be represented at least approximately
by linear functions. While it is not strictly true that manufacturing
100 tons of steel costs ten times as much as manufacturing 10 tons, 
this is at least a reasonable approximation. And the same holds for 
necessary raw materials, for labor needed, transportation costs, etc. 
Linear functions are so simple to handle that we try to use them 
whenever this is reasonable. 

Not only is it true that every column vector represents a linear
function, but every linear function of row vectors can be so represented.
We will prove this for linear functions of three-component
row vectors.

Let us suppose that f assigns a number f(x) to each three-component
vector x, and that it has the properties (i) and (ii). Consider the three
special vectors,

$$e_1 = (1,0,0), \quad e_2 = (0,1,0), \quad e_3 = (0,0,1).$$

Let us call $f(e_1) = y_1$, and let $y_2 = f(e_2)$, $y_3 = f(e_3)$. And let

$$y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}.$$

If $x = (x_1, x_2, x_3)$, we can write $x = x_1 e_1 + x_2 e_2 + x_3 e_3$. Hence, using
properties (i) and (ii), we see that

$$\begin{aligned}
f(x) &= f(x_1 e_1 + x_2 e_2 + x_3 e_3) \\
     &= f(x_1 e_1) + f(x_2 e_2) + f(x_3 e_3) \\
     &= x_1 f(e_1) + x_2 f(e_2) + x_3 f(e_3) \\
     &= x_1 y_1 + x_2 y_2 + x_3 y_3 = x \cdot y.
\end{aligned}$$

Hence the column vector y represents the linear function f.
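
The proof is constructive: the representing vector is obtained by evaluating f on the special vectors. The following is a minimal sketch in Python, not part of the original text; the particular function f (borrowed from Exercise 1(a) below), the test vector, and the helper names are hypothetical choices.

```python
def dot(x, y):
    """The number x . y for a row vector x and a column vector y."""
    return sum(xi * yi for xi, yi in zip(x, y))

def representing_vector(f, n):
    """Column vector y with components y_i = f(e_i), e_i the i-th special vector."""
    return [f([1 if j == i else 0 for j in range(n)]) for i in range(n)]

def f(x):
    """A linear function of three-component row vectors, chosen for illustration."""
    return 3 * x[0] + x[1] - 2 * x[2]

y = representing_vector(f, 3)
x = [4, -1, 5]            # an arbitrary test vector

print(y)                  # the components f(e_1), f(e_2), f(e_3)
print(f(x), dot(x, y))    # the two numbers agree
```

The same construction applied to a function that lacks properties (i) and (ii) would still produce a vector y, but x · y would then fail to reproduce f(x) for some x.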

Example 1. An office buys three kinds of paper: heavy bond, light
bond, and a cheaper quality for intraoffice use. The amounts bought
(in reams) are given by the row vector x = (20,50,70). The prices
per ream of these types of paper are given (in cents) by the column
vector

$$y = \begin{pmatrix} 160 \\ 140 \\ 120 \end{pmatrix}.$$

Then f(x) = x · y = 18,600 cents = $186.00 is the cost of the order.
So far y defines a linear function of x. It is customary to give
a discount if 100 or more reams are ordered of one item. The new
rules for computing the bill define a new function of x, different from f.
Let us call the new function by the letter g. Then g(2x) < 2g(x),
since the office gets a discount on the light bond and on the cheaper
paper. Now we have a function that is not linear. It often happens
that a function in science is nearly linear for restricted values of the
components, but not even roughly linear outside this range.
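
The failure of linearity in Example 1 can be made concrete once a discount rule is fixed. The following is a minimal sketch in Python, not part of the original text. The text does not state the size of the discount, so the figure of 10 per cent is a hypothetical assumption, as are the function names.

```python
PRICES = [160, 140, 120]     # cents per ream, from Example 1

def f(x):
    """Undiscounted bill in cents: the linear function x . y."""
    return sum(q * p for q, p in zip(x, PRICES))

def g(x, discount_percent=10, threshold=100):
    """Bill in cents with an assumed discount on each item ordered in
    quantity of `threshold` reams or more."""
    total = 0
    for q, p in zip(x, PRICES):
        cost = q * p
        if q >= threshold:
            cost = cost * (100 - discount_percent) // 100
        total += cost
    return total

x = [20, 50, 70]
double_x = [2 * q for q in x]

print(f(x), f(double_x) == 2 * f(x))       # f doubles when the order doubles
print(g(x), g(double_x), 2 * g(x))         # g does not
```

For the order x no item reaches 100 reams, so g(x) = f(x); in the doubled order the light bond and the cheaper paper both qualify, which is why g(2x) falls short of 2g(x).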

Sometimes we assign, not a single quantity to a row vector, but
several quantities. Then we say that the vector is transformed into
another vector. We saw an example (in Section 3) where the row
vector giving numbers of houses being built was transformed into a
row vector giving amounts of raw materials. We say that the transformation
is a linear transformation if each component in the resulting
vector is a linear function of the given vector.

It follows immediately from the definition of a linear transformation
that it can be represented by a set of column vectors, which can
also be written as a matrix. Conversely, every matrix defines a linear
transformation of vectors. Finally, it follows from (i) and (ii) that a
linear transformation T satisfies the properties T(kx) = kT(x) and
T(x + x') = T(x) + T(x'), where x and x' are vectors and k is a
number.

Example 2. Let us suppose that the population of the United
States is divided into five groups according to income. The components
of the row vector x are the number of people in each bracket.
Say x_1 people have an income of $100,000 or above, x_2 have incomes
between $20,000 and $100,000, etc. If we know the average number
of cars owned by men in a given income bracket, we can represent
these five numbers as a column vector, and we get the number of
privately owned cars as a linear function of x. Similarly we could get
the number of yachts, privately owned houses, or television sets.
Each of these four quantities is a linear function of x (at least approximately)
and each is represented by a five-component column vector
whose entries are averages. Writing the four vectors together as a
rectangular array, we get a 5 × 4 matrix. This is a linear transformation
transforming x into a four-component row vector, whose components
are the total number of cars, yachts, houses, and television
sets, respectively.
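
A transformation of this kind is simply a product of a row vector with a matrix. The following is a minimal sketch in Python, not part of the original text. Every number in it is hypothetical: the averages in the 5 × 4 array and the head counts are invented only to show the shape of the computation.

```python
def transform(x, A):
    """Row vector x (m components) times an m-by-n matrix A."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]

# Hypothetical averages per person, one row per income bracket.
# Columns: cars, yachts, houses, television sets.
A = [[3, 1, 2, 4],
     [2, 0, 1, 2],
     [1, 0, 1, 1],
     [1, 0, 0, 1],
     [0, 0, 0, 1]]

x = [1, 10, 100, 200, 300]       # hypothetical numbers of people per bracket
x_prime = [2, 0, 5, 0, 1]        # a second hypothetical vector
k = 3

def T(v):
    return transform(v, A)

print(T(x))   # totals of cars, yachts, houses, television sets
# The two properties of a linear transformation:
print(T([k * v for v in x]) == [k * v for v in T(x)])
print(T([u + v for u, v in zip(x, x_prime)]) ==
      [u + v for u, v in zip(T(x), T(x_prime))])
```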


EXERCISES 

1. $x = (x_1, x_2, x_3)$. Test each of the following functions of x as to whether
it has properties (i) and (ii).

(a) $f(x) = 3x_1 + x_2 - 2x_3$. [Ans. Linear.]

(b) $f(x) = x_1 x_2 x_3$.

(c) $f(x) = \sqrt{(x_1)^2 + (x_2)^2 + (x_3)^2}$. [Ans. Not linear.]

(d) $f(x) = x_2$.

2. $x = (x_1, x_2)$. Test each of the following transformations of x into y
as to whether it is a linear transformation.

(a) $y_1 = 2x_1 + 3x_2$ and $y_2 = x_1 - x_2$. [Ans. Linear.]

(b) $y_1 = x_1 + 2x_2$ and $y_2 = -x_1 x_2$. [Ans. Not linear.]

(c) $y_1 = x_2$ and $y_2 = -x_1$.

For the linear transformations above, write the matrix representing the
transformation.

3. Prove that the function f(x) = c, where x is a two-component row
vector and c is a constant, is a linear function if and only if c = 0.

4. Prove that the function $f(x) = ax_1 + bx_2 + c$, where x is a two-component
row vector and a, b, and c are constants, is a linear function if and only
if c = 0.

5. Prove that the transformation T(x) = xA + C, where x is a two-component
row vector and A and C are 2 × 2 matrices, is a linear transformation
if and only if C = 0.

6. Prove that f(x) = (least component of x) is not a linear function.

7. Let x be a 12-component row vector. Its components are the enrollment
figures in twelve mathematics courses. Give an example of

(a) A linear function of x. 

[An Ans. The total enrollment in all mathematics courses.] 

(b) A linear transformation of x. 

(c) A nonlinear function of x . 

8. Let the components of x be the number of fiction books, the number 
of nonfiction books, and the number of other publications in a library. 
For each of the following functions, state whether or not it is a linear function 
of x. 

(a) The total number of publications. [Ans. Linear.] 

(b) The total number of cards in the catalogue. (Assume that each 
book has two cards, each other publication has one.) 

9. If in (i) and (ii), x is taken as a column vector, then the conditions 
define a linear function of a column vector. How can we represent such a 
function? How can we represent a linear transformation of column vectors? 

10. Show that the matrix R defined in Section 3 can be thought of as a
transformation of both row vectors and column vectors.

*10. PERMUTATION MATRICES 

In Chapter III we defined a permutation of n objects to be an 
arrangement of these objects in a definite order. Thus the set {a,b,c} 
has six permutations: abc, acb, bac, bca, cab, and cba. There is a 
slightly different way of thinking of a permutation. We may think 
of our set as given originally in a definite order, say abc, and then think 
of a permutation as a rearrangement of the set. Thus one permutation 
changes abc into bac; i.e., the first element is put into the second spot, 
the second into the first spot, and the third element is left unchanged. 
In order to arrive at the same number, n!, of permutations as before,
we must consider the “rearrangement” that changes nothing, i.e., the
permutation that “changes” abc into abc. We shall consider our n 
objects as components of a row vector. A permutation changes the 
row vector into another having the same components, but possibly 
in a different order. 

A convenient way to describe permutations is by means of certain 
special matrices. For example, the rearrangement given above can be 
described by the product 

$$(x_1, x_2, x_3)\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = (x_2, x_1, x_3).$$

In this we do not have to think of the $x_i$ as numbers. They are objects
of any sort for which multiplication by 0 and 1 and addition is defined
as for numbers. The 3 × 3 matrix then represents our permutation.
It has only 0’s and 1’s as components, and there is exactly one 1 in
each row and in each column.


Definition 1. A permutation matrix is a square matrix having 
exactly one 1 in each row and each column, and having 0’s in all other 
places. 


Examples of permutation matrices are shown in Figure 8. Since
these matrices are square matrices (n × n), we can speak of the matrix
as having degree n. Thus Figure 8 shows one matrix of degree 2,
two of degree 3, and one of degree 4.

[Figure 8 shows four permutation matrices A, B, C, and D. The entries of A, B, and C are illegible in the source.]

$$D = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Figure 8


Theorem 1. Every permutation matrix of degree n represents a 
permutation of n objects, and every such permutation has a unique 
matrix representation.

Let us consider n objects $x_1, x_2, \ldots, x_n$ which by a permutation
are rearranged to give $y_1, y_2, \ldots, y_n$. Here each of the y’s is one of
the x’s, and every x is some y. If it happens that $y_j = x_i$, then the
object in the ith position was changed to the jth position. In this
case, define $p_{ij} = 1$ and $p_{ik} = 0$ for $k \ne j$. Doing this for every i, we
obtain an n × n permutation matrix P such that

(1)   $(x_1, x_2, \ldots, x_n)P = (y_1, y_2, \ldots, y_n)$.

The fact that no two elements of a single row or a single column of P
are 1 (i.e., that P is a permutation matrix) follows from the fact that
in a permutation each element appears once and only once in the
rearrangement.

On the other hand, if we are given a permutation matrix P, then
we can define a permutation by the product (1). The fact that each
column of P has exactly one 1 means that each $y_j$ is some $x_i$. The fact
that P has only one 1 in each row means that every $x_i$ appears as
only one $y_j$. Hence the vector $(y_1, y_2, \ldots, y_n)$ does represent a rearrangement
of the vector $(x_1, x_2, \ldots, x_n)$, completing the proof of
the theorem.
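
The construction in the proof translates directly into a short program. The following is a minimal sketch in Python, not part of the original text; positions are numbered from 0 rather than 1, as is usual in Python, and the function names are hypothetical.

```python
def permutation_matrix(dest):
    """dest[i] = j means that the object in position i is moved to position j.
    The matrix has a 1 in row i, column j, and 0 elsewhere."""
    n = len(dest)
    return [[1 if dest[i] == j else 0 for j in range(n)] for i in range(n)]

def apply_permutation(x, P):
    """The product xP for a 0-1 matrix P; the components of x may be any objects."""
    n = len(x)
    return [next(x[i] for i in range(n) if P[i][j] == 1) for j in range(n)]

def is_permutation_matrix(P):
    """Exactly one 1 in each row and each column, 0 in all other places."""
    n = len(P)
    expected = [0] * (n - 1) + [1]
    rows_ok = all(sorted(row) == expected for row in P)
    cols_ok = all(sorted(P[i][j] for i in range(n)) == expected for j in range(n))
    return rows_ok and cols_ok

P = permutation_matrix([1, 0, 2])     # first object to second place, second to first
print(P)
print(is_permutation_matrix(P))
print(apply_permutation(["a", "b", "c"], P))   # abc is changed into bac
```

Because the product is computed by selection rather than by arithmetic, the components may be letters, in keeping with the remark that the objects need not be numbers.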

We shall restrict ourselves to the case of n = 4 for illustrating the 
following discussion, but all the results we are about to establish will 
hold for every n. In Figure 9 we find four examples of permutation 
matrices of degree 4. 


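$$I = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad
J = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix},$$

$$K = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad
L = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}.$$

Figure 9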


We want to study the product of two permutation matrices
of degree 4. If $x = (x_1, x_2, x_3, x_4)$, then $xJ = (x_4, x_1, x_3, x_2)$ and
$xK = (x_2, x_1, x_4, x_3)$. The former puts the first component into second
place, the second component into fourth place, and the fourth component
into first place, leaving the third component unchanged. The
latter interchanges the first two and the last two. What happens if
we perform the two permutations, one after the other? Let us first
consider $x_1$. In the first transformation it is changed into the second
component, while in the second transformation, the second component
is changed into the first. Hence $x_1$ ends up where it started, in first
place. The component $x_2$ is first sent into the number four slot, and
then this is changed to number three by the second transformation.
Hence $x_2$ ends up as the third component. Component $x_3$ is at first
not changed, but later changed into component four. Component $x_4$
is first made into the first component, and in the second transformation
it is changed into the second component. Hence, starting with x,
after two transformations we end up with $(x_1, x_4, x_2, x_3)$.

Let us now consider the product

$$
JK = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}.
$$


The matrix JK is again a permutation matrix, and it is easy to check 
that it represents precisely the permutation described above. 

Theorem 2. The product JK of two permutation matrices of the
same degree is again such a permutation matrix. It represents the
result of first performing permutation J, then permutation K.

This theorem is very easy to prove in matrix form. We wish to 
know what x(JK) is. By the associative law (see Section 4) this is 
the same as (xJ)K. But xJ is the result of the J permutation, and 
(xJ)K is the result of applying the K permutation to xJ . This proves 
the theorem. 
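The argument can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `matmul` and `permute` are our own, and the matrices J and K are those of Figure 9. It compares x(JK) with (xJ)K for a vector of labelled components.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def permute(x, P):
    """Row vector x times permutation matrix P: component j of the result
    is the x_i for which P[i][j] == 1."""
    n = len(x)
    return [next(x[i] for i in range(n) if P[i][j] == 1) for j in range(n)]

# J and K as in Figure 9.
J = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0],
     [1, 0, 0, 0]]
K = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]

x = ["x1", "x2", "x3", "x4"]
JK = matmul(J, K)
one_step = permute(x, JK)                # x(JK)
two_steps = permute(permute(x, J), K)    # (xJ)K
print(one_step, two_steps)
```

Both results agree with the rearrangement found by tracing each component through the two permutations in turn.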


Example. Referring to Figure 9 let us consider the products IJ
and JI. We know, of course, that IJ = JI = J. Hence Theorem 2
tells us that performing the I permutation followed by the J
permutation (or the reverse) will result simply in the J permutation. If we
note that the I permutation leaves everything unchanged, this result
is obvious.

Let us now consider the product JL, where again J and L are as
in Figure 9. The product is equal to I; hence $L = J^{-1}$. By Theorem 2
we know that the permutation J followed by L will result in the
permutation I, i.e., in no change at all. Thus we see that $L = J^{-1}$
is a permutation that undoes all changes made by J. We also note
a similarity in the structure of J and L; the latter is formed from the
former by turning it over its main diagonal (the diagonal slanting
from the upper left-hand corner to the lower right-hand corner). In
other words L has as its i,jth component what J has as its j,ith
component.

Definition 2. The transpose $A^*$ of a square matrix A is formed by
turning it over its main diagonal; that is, the entries of $A^*$ are given
by $a^*_{ij} = a_{ji}$.

Theorem 3. If P is a permutation matrix, then $P^*$ is its inverse;
that is, $P^*$ represents the permutation which undoes what the
permutation P does.

We must show that $P^*$ undoes what P does; the remainder will
follow from the above discussion and Section 6. Let us suppose that
$p^*_{ij} = 1$. Then $p_{ji} = 1$; hence the permutation P moves component
$x_j$ into position i. But then, because $p^*_{ij} = 1$, the component is moved
by $P^*$ from position i into position j. Hence $x_j$ ends up in position j, where
it started; and this holds for every component. Thus $P^*$ undoes the
work of P, which proves the theorem.
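As a quick check of Theorem 3 on one case, the sketch below (our own helper names, not the book's notation) forms the transpose of the matrix J of Figure 9 and multiplies it by J on either side.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    """Turn the matrix over its main diagonal: entry (i, j) becomes entry (j, i)."""
    return [list(column) for column in zip(*A)]

# J as in Figure 9.
J = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0],
     [1, 0, 0, 0]]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

J_star = transpose(J)
print(J_star)
print(matmul(J, J_star) == identity, matmul(J_star, J) == identity)
```

The transpose computed here is the matrix L of Figure 9, and both products give I. A single example does not prove the theorem, but it illustrates what the proof asserts.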

Definition 3. A set of objects forms a group (with respect to
multiplication) if:

(i) The product of two elements of the set is always an element
of the set.

(ii) There is in the set an element I, called the identity element,
such that for every A in the set, IA = AI = A.

(iii) For every A in the set there is an element $A^{-1}$ in the set such
that $AA^{-1} = A^{-1}A = I$.

(iv) For every A, B, C in the set, A(BC) = (AB)C.

Definition 4. A set of objects forms a commutative group if in
addition to the above four properties it also satisfies:

(v) For every A and B in the set, AB = BA.

Theorem 4. The permutation matrices of degree n form a group
(with respect to matrix multiplication), but this group is not
commutative if n > 2.

Proof. Property (i) was shown in Theorem 2. Property (ii) follows
from the more general fact that IM = MI = M, for every n × n
matrix M. From Theorem 3 we know that A has an inverse, namely
$A^{-1} = A^*$. It is easy to show that $A^*$ is again a permutation matrix
(see Exercise 1). Hence (iii) follows. And (iv) again follows from the
more general theorem that all matrices obey this associative law. (See
Section 6.) On the other hand it is easy to show examples, for any
n > 2, where $AB \ne BA$. (See Exercises 2-3.) This completes the
proof.
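For small n the group properties can be verified by brute force. The sketch below is only an illustration under our own naming (`perm_matrices`, `check_group` are not from the text): it lists every permutation matrix of degree n and tests properties (i), (ii), (iii), and (v) directly. Property (iv) is not tested, since it holds for all matrices.

```python
from itertools import permutations

def perm_matrices(n):
    """All n x n permutation matrices, as tuples of tuples (so they can go in a set)."""
    return [tuple(tuple(1 if j == cols[i] else 0 for j in range(n))
                  for i in range(n))
            for cols in permutations(range(n))]

def matmul(A, B):
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def transpose(A):
    return tuple(zip(*A))

def check_group(n):
    """Return (size, closed, has_identity, has_inverses, commutative)."""
    G = set(perm_matrices(n))
    identity = tuple(tuple(1 if i == j else 0 for j in range(n))
                     for i in range(n))
    closed = all(matmul(A, B) in G for A in G for B in G)            # (i)
    has_identity = identity in G                                     # (ii)
    has_inverses = all(transpose(A) in G and
                       matmul(A, transpose(A)) == identity
                       for A in G)                                   # (iii)
    commutative = all(matmul(A, B) == matmul(B, A)
                      for A in G for B in G)                         # (v)
    return len(G), closed, has_identity, has_inverses, commutative

for n in (1, 2, 3, 4):
    print(n, check_group(n))
```

For n = 1 and n = 2 every check succeeds, while for n = 3 and n = 4 the commutativity check fails, in agreement with the theorem.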

The group formed by the n × n permutation matrices is known as
the permutation group of degree n. Since permutations are used in the
study of symmetry, this group is also called the symmetric group of
degree n.


EXERCISES 


1. Prove that the transpose of a permutation matrix is a permutation 
matrix; i.e., that if A satisfies Definition 1, then so does A*. 

2. Write all permutation matrices of degree 1. Write all permutation 
matrices of degree 2. Show that these two groups are commutative. 

3. For n > 2, we can form the matrix A which only interchanges $x_1$ and
$x_2$, and the matrix B which only interchanges $x_1$ and $x_3$. What permutations
are performed by AB and by BA? Are these two the same? Use this fact
to show that the permutation group of degree n > 2 is not commutative.


4. Write down the permutation matrices which change $(x_1,x_2,x_3,x_4)$ into:


(a) $(x_2,x_3,x_4,x_1)$.

(b) (x h x h X2,x 4 ). 

(c) $(x_2,x_3,x_1,x_4)$.

(d) $(x_1,x_2,x_3,x_4)$.


[Ans. (a) $\begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$.]


5. For the following pairs of matrices, find the permutations they
represent. In each case show that AB represents the permutation A followed by
the permutation B, and that BA represents the permutation B followed by
the permutation A.

(a) $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$.

(b) $A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$.

(c) $A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}$.

[Ans. (a) $xA$ is $(x_3,x_1,x_2)$; $xB$ is $(x_1,x_3,x_2)$; $xAB$ is $(x_3,x_2,x_1)$; $xBA$ is

6. Prove that the set of all 3 × 3 matrices does not form a group (with
respect to matrix multiplication).

7. Find the inverses of the six matrices in Exercise 5 by using Theorem 3.
Check your answers by multiplying the matrices by their inverses.

[Ans. (a) $A^{-1} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$; $B^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$.]

8. The process of division is usually introduced by saying that b/a is
the solution of the equation ax = b (or of xa = b).

(a) Prove that in a group the equation AX = B always has a unique 
solution. 

(b) Prove that in a group the equation XA = B always has a unique 
solution. 

(c) Show by means of an example that the two equations need not 
have the same solution. 

9. For the set of numbers {1,2,3,4} we define “multiplication” by means
of the following table.

      ×  |  1  2  3  4
     ----+-------------
      1  |  1  2  3  4
      2  |  2  4  1  3
      3  |  3  1  4  2
      4  |  4  3  2  1

(In this table we have neglected all multiples of 5; e.g., 2 × 4 = 8, but we
neglected the 5 and just kept the remainder 3. Again 3 × 4 = 12, but we
ignored the 10, which is a multiple of 5, and kept the remainder 2.) Prove
that this set with multiplication so defined forms a commutative group.

10. For the set {1,2,3,4,5,6} write down a multiplication table, ignoring all 
multiples of 7. (See Exercise 9.) Prove that the result is a commutative 
group. 

11. For the set {1,2,3,4,5} write down a multiplication table, ignoring 
multiples of 6. (See Exercises 9 and 10.) Prove that the result is not a group. 
Why do 5 and 7 give us groups, but not 6? 

12. Write down all permutation matrices of degree 3, and assign letter- 
names to them. Write a multiplication table for this group. How, from this 
table alone, can we see that properties (i), (ii), and (iii) hold? How do we 
see that (v) does not hold? 

*11. SUBGROUPS OF PERMUTATION GROUPS 

Within a group we sometimes can find smaller groups. Here we 
shall study some of the subgroups of permutation groups. It will be 
understood that whenever we speak of a group we have a set with a 
finite number of elements in mind. In particular this will be assumed 
for the theorems given below, since some of the theorems are not valid 
for groups with an infinite number of elements. The concept of a 
group has important applications for infinite sets, but these do not 
belong in this book. 

Definition 1. If a given set G forms a group, and some subset
H of it also forms a group, we call the subset H a subgroup of G. If
the subset H is a proper subset of G, we speak of a proper subgroup.

Theorem 1. If we select any element of a group, the powers of 
the element form a subgroup which is commutative. 

Proof. Select any element A of the given group; we must show that
the powers $A^n$ have the properties (i)-(v) given in the last section.
The product of two powers is again a power, $A^j A^k = A^{j+k}$; hence (i)
holds. Next we observe that the powers cannot be all different, since
this would give us infinitely many elements in our group. Hence we
must have an equation $A^j = A^k$, with, say, j > k. However, this
implies that $A^{j-k} = I$. Hence I occurs among the powers of A, say
$I = A^m$. Therefore (ii) holds. If m = 1 or 2, then A is its own inverse
(see Exercise 9). On the other hand, if m > 2, then among the powers
we find $A^{m-1}$, and $AA^{m-1} = A^m = I$, so that $A^{m-1}$ is the inverse of A.
This shows that property (iii) holds. The associative law (iv) follows
from the fact that all matrices obey this law. Finally, we get
commutativity (v) from the fact that $A^j A^k = A^{j+k} = A^{k+j} = A^k A^j$,
completing the proof.

Definition 2. A group which consists of the powers of one
element A is known as the cyclic group generated by A.

Thus we know that we can form a cyclic subgroup of a given group
by picking any one element A and taking all its powers. The number
of elements in this subgroup is called the order of A. In the proof
above, the order of A is the smallest possible m such that $A^m = I$.
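The order of a permutation matrix can be found by simply multiplying until the identity appears. The following sketch uses our own function name `cyclic_subgroup` and assumes its argument is a permutation matrix, so that the loop is guaranteed to stop; J and K are the matrices of Figure 9.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cyclic_subgroup(A):
    """Return the list A, A^2, ..., A^m = I of distinct powers of A.

    Assumes A is a permutation matrix, so some power equals the identity."""
    n = len(A)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    powers = [A]
    while powers[-1] != identity:
        powers.append(matmul(powers[-1], A))
    return powers

J = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0],
     [1, 0, 0, 0]]
K = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]

order_J = len(cyclic_subgroup(J))   # number of elements in the cyclic subgroup
order_K = len(cyclic_subgroup(K))
print(order_J, order_K)
```

The length of the returned list is the order of the element, since the last entry is the first power equal to I.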

Example 1. The permutation group of degree 4 has 4! = 24
elements. Let us consider the cyclic subgroup generated by J (see
Figure 9). We find that $J^2 = L = J^{-1}$, so that $J^3 = JJ^2 = I$. Thus
our cyclic subgroup consists of J, $J^2 = L$, and $J^3 = I$. If we continue
to take higher powers, we get $J^4 = J$, $J^5 = L$, $J^6 = I$, etc. The
elements are repeated in this fixed cycle. This is the source of the name
“cyclic.”

Example 2. We can get a larger cyclic subgroup by choosing the
matrix M and its powers (see Figure 10); M has order 4; hence
$M^{-1} = M^* = M^3$, and $M^4 = I$.



Theorem 2. If in a group we select any subset having property 
(i), then this subset is a subgroup. 

Proof. We must show that the subset also has properties (ii)-(iv).
Let A be any element of the subset. By (i), $AA = A^2$ is also in the
subset, and then $AA^2 = A^3$ is in the subset, etc. Hence all powers
of A are in the subset. One of these powers is I and one is $A^{-1}$. Hence
we have properties (ii) and (iii). Property (iv) again follows from the
fact that all matrices have this property, completing the proof of the
theorem.

We now have a practical way of finding subgroups. We select one 
or more elements of the group, and form all possible products of these, 
using each one as many times as necessary. If we form all possible 
products, then the product of any two products will also be on our 
list, and hence property (i) holds. Then, by Theorem 2, we have a 
subgroup, which is called the subgroup generated by the elements. If 
we start with a single element, we obtain a cyclic subgroup. Some 
very interesting subgroups can be generated by two elements. 
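The procedure just described can be carried out mechanically. The sketch below is one possible way to do it, with names of our own choosing: it keeps multiplying elements already found until no new products appear. J and K are taken from Figure 9. The matrix D is assumed here to be the matrix that interchanges only the first two components, which is consistent with the effective set $\{x_1,x_2\}$ attributed to D later in this section.

```python
def matmul(A, B):
    """Matrix product; matrices are tuples of tuples so they can be stored in a set."""
    n = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))

def generated_subgroup(generators):
    """All products that can be formed from the given generators."""
    elements = set(generators)
    frontier = list(elements)
    while frontier:
        A = frontier.pop()
        # Multiply A on both sides by everything found so far.
        for B in list(elements):
            for C in (matmul(A, B), matmul(B, A)):
                if C not in elements:
                    elements.add(C)
                    frontier.append(C)
    return elements

J = ((0, 1, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (1, 0, 0, 0))
K = ((0, 1, 0, 0), (1, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0))
# Assumption: D interchanges the first two components and leaves the rest fixed.
D = ((0, 1, 0, 0), (1, 0, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1))

for name, gens in (("J", [J]), ("J, D", [J, D]), ("D, K", [D, K]), ("J, K", [J, K])):
    print(name, len(generated_subgroup(gens)))
```

Every pair of elements is eventually multiplied in both orders, so the resulting set has property (i) and, by Theorem 2, is a subgroup.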

Example 3. Let us start with J (see Figure 9), and D (see Figure
8), and form the subgroup they generate. First of all we get the
powers of J, namely, J and $J^2 = L$ and $J^3 = I$, as was shown in
Example 1. Then we have D, and $D^2$, which is again I. In
products formed using both J and D we need consider only J and $J^2$
and D, since the next higher power is I, and then the powers are
repeated. Theoretically we should consider products like $DJDJ^2$ and
$JDJDJDJ$, but we can show as follows that such long products give
nothing new. First we observe that $DJ = J^2D$, so that in a long
product we may always replace DJ by $J^2D$, and thus put all the J's
in front and all the D's at the end. (See Exercise 14.) Therefore the
only new products that we need consider are of the form $J^aD^b$; and
since J can occur only to the first or second power and D only to
the first power, we arrive at JD and $J^2D$ as the only additional
products. Hence our subgroup has six elements: J, $J^2$, D, I, JD, and $J^2D$.
Since $JD \ne DJ$, the subgroup is not commutative.

So far we have found subgroups of 3, 4, and 6 elements. Each of 
these numbers is a divisor of 24, the total number of elements in the 
group. It can be shown that the number of elements in a subgroup 
is always a divisor of the number of elements in the group, but we 
will not prove that fact here. 

Example 4. Let us now form the subgroup generated by D and
K. Since $D^2 = I = K^2$, both D and K will occur only to the first
power in a product. Furthermore DK = KD; hence the subgroup
will have only four elements: I, D, K, DK. This subgroup is
commutative. The fact that the subgroup happens to be commutative
is a consequence of the following theorem.

Theorem 3. If A and B commute (i.e., AB = BA) then any two 
products formed from A and B also commute. Hence the subgroup 
generated by A and B is a commutative subgroup. 

Proof. Given any product formed from A and B, say AABBBABAB,
we can make use of the fact that AB = BA to move all the A's up
front and all the B's to the end. Hence the product can be written
$A^iB^j$. A second such product can be written $A^kB^m$. The product of
these, $A^iB^jA^kB^m$, can again be rearranged so that all the A's come
at the beginning. Hence $(A^iB^j)(A^kB^m) = A^{i+k}B^{j+m} = A^{k+i}B^{m+j} =
(A^kB^m)(A^iB^j)$, completing the proof.

We have now found two types of commutative subgroups: (1) cyclic
subgroups and (2) subgroups generated by two elements that
commute. For the latter it is convenient to have a technique for finding
two commuting elements. We will develop one method for finding
such pairs.

Definition 3. The effective set of a permutation matrix is the set 
of all those components of the row vector which are changed by the 
matrix. 

For example, D has $\{x_1,x_2\}$ as its effective set, J has $\{x_1,x_2,x_4\}$,
K has the set of all four components, and I has the empty set as
effective set. K suggests the definition:

Definition 4. A permutation matrix having all the components
in its effective set is called a complete permutation matrix.

Theorem 4. Two permutation matrices, whose effective sets are 
disjoint, commute. 

Proof. Let $A_1$ have $X_1$ as its effective set, and $A_2$ have $X_2$, so that
$X_1 \cap X_2 = \mathcal{E}$. Then $A_1A_2$ will make some changes on $X_1$ and then
on $X_2$. The latter are not affected by the former, since $X_1$ and $X_2$ have
nothing in common. Thus we get the same result if we perform $A_2$
followed by $A_1$.

We now have a simple way of getting a commutative subgroup,
other than a cyclic one. Just select any two matrices (other than I)
with disjoint effective sets, and form the subgroup that they generate.
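Effective sets are easy to compute: a component $x_i$ is left unchanged exactly when the matrix has a 1 in position (i,i). The sketch below uses that observation. The function name `effective_set` is ours, D is assumed to be the matrix interchanging the first two components (as its effective set above indicates), and E is a hypothetical matrix, introduced only for this illustration, that interchanges the last two components.

```python
def matmul(A, B):
    """Ordinary matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def effective_set(P):
    """Indices i (counted from 1) of the components x_i that P moves."""
    return {i + 1 for i in range(len(P)) if P[i][i] != 1}

J = [[0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0]]   # Figure 9
K = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]   # Figure 9
D = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # assumed form of D
E = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]   # hypothetical example

print(effective_set(D), effective_set(J), effective_set(K), effective_set(E))

# D and E have disjoint effective sets, so Theorem 4 says they commute.
print(effective_set(D) & effective_set(E) == set(), matmul(D, E) == matmul(E, D))

# D and J have overlapping effective sets; here they happen not to commute.
print(matmul(D, J) == matmul(J, D))
```

Note that Theorem 4 gives a sufficient condition only: D and K commute (Example 4) even though their effective sets overlap.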

EXERCISES 

1. Write down the six permutation matrices of degree 3. 

2. Form the cyclic subgroup for each of the six matrices in Exercise 1. 
Are these subgroups all different? What is the order of each matrix? 

[Ans. Five distinct groups; one of order 1; three of order 2; two of 

order 3.] 

3. Prove that there are no proper subgroups of the permutation group 
of degree 3, other than those found in Exercise 2. 

4. Write the 24 permutation matrices of degree 4. 

5. Form the cyclic subgroup for each of the matrices in Exercise 4. How 
many different ones do you get? What is the order of each matrix? 

[Ans. 17 distinct groups; one of order 1; nine of order 2; eight of order
3; six of order 4.]

6. Show by an example that the subgroups found in Exercise 5 are not 
the only proper subgroups of the permutation group of degree 4. 

7. Prove the following facts about orders of permutations: 

(a) I has order 1. 

(b) A permutation which does nothing but interchange one or more 
pairs of elements has order 2. 

(c) Every other permutation has an order greater than 2. 

8. Prove that the subgroup generated by A and B is cyclic if and only if 
one generator is a power of the other. 

9. Prove that if a matrix has order 1 or 2, then it is its own inverse. 

10. A matrix M is said to be symmetric if $m_{ij} = m_{ji}$ for all i and j. Prove
that a permutation matrix is symmetric if and only if it has order 1 or 2.

11. Form the subgroup generated by J and K. 

[Ans. There are 12 elements.] 

12. Prove the following facts about effective sets: 

(a) I has an effective set of zero elements. 

(b) A matrix which simply interchanges two elements has as effective 
set a set of two elements. 

(c) All other matrices have an effective set of at least three elements. 
(d) A matrix is complete if and only if the number of elements in its
effective set equals its degree.

13. We wish to form a commutative subgroup of the permutation group
of degree 4, by means of the method described above. We want to choose
two matrices (other than I) with disjoint effective sets, and form the
subgroup they generate.

(a) Using the results of Exercise 12, what must the number of elements
be in the two effective sets? [Ans. 2; 2.]

(b) Choose such a pair of matrices. 

(c) Form the subgroup. 

14. Prove the following facts about Example 3 above.

(a) $DJ = J^2D$.

(b) From this it follows that $DJ^2 = JD$.

(c) In any product of D's and J's we can put all the J's up front.

15. If A has order m, and m is an even number, then $A^{m/2}$ is its own
inverse. Prove this fact. What does this say about an element of order 2?

16. Prove that the cyclic group generated by A 2 is a subgroup of that 
generated by A. When will this be a proper subgroup? 
Chapter VI*


LINEAR PROGRAMMING AND
THE THEORY OF GAMES


1. CONVEX SETS

By the locus of a linear equation of the form ax + by = c we mean
the set of all points whose coordinates (x,y) satisfy the equation. For
example, the locus of the equation

(a) 2x + 3y = 6

can be found, by trial and error, to be the straight line plotted in
Figure 1. Thus, setting x = 0 we get y = 2, so that the point (0,2)
is on the locus; similarly, x = 1 gives y = 4/3, so that (1,4/3) is on the
locus; in the same way the point (3,0) is on the locus; etc. We now
wish to consider the loci of inequalities in the variables x and y.
Consider

(b) 2x + 3y < 6

as an example. What points (x,y) satisfy this inequality? By trial
and error we can find many points on the locus. Thus (1,1) is on it
since $2 \cdot 1 + 3 \cdot 1 = 5 < 6$; on the other hand (1,2) is not on the locus
since $2 \cdot 1 + 3 \cdot 2 = 8$, which is not less than 6. In between these two
points we find (1,4/3) which lies on the boundary, i.e., on the locus of (a).
We note that by increasing y we went outside the locus; by decreasing
y we came into the locus. This holds in general. Given a point on
(a), increasing y will give us more than 6, decreasing y gives us less
than 6, and hence the latter is on (b). We find that the locus of (b)
consists of all points below the line (a). This is the shaded area in
Figure 1. The area on one side of a straight line is called an open half
plane.

We can apply exactly the same analysis to 

(c) 2x + 3y > 6 

to see that its locus is the open half plane above the line (a). This 
can also be deduced from the fact that the loci (a), (b), and (c) are 
disjoint and that their union is the entire plane. 

If we have an inequality of the form $2x + 3y \le 6$, its locus will
consist of the union of the (a) and (b) loci; hence it consists of an
open half plane together with its boundary, which we call a closed half
plane. The same type of analysis shows that as ax + by = c always
has a straight line as its locus, ax + by < c has an open half plane
and $ax + by \le c$ has a closed half plane for a locus.

We will discuss an alternate interpretation of loci, which makes
some of the above considerations clearer. An equation or inequality
in x and y may be thought of as a statement whose truth or falsity
depends on what x and y are, or on what the point (x,y) is. Thus each
point of the plane represents one logical possibility, and the entire
plane may be thought of as the set of all logical possibilities. Then
the so-called locus is simply the truth set of the statement. Since (a),
(b), and (c) are a complete set of alternatives (see Chapter I, Section
8) their truth sets are disjoint and have $\mathcal{U}$ as their union (see Chapter
III, Section 1); hence they form a partition of $\mathcal{U}$. The statement
$2x + 3y \le 6$ is equivalent to (a) $\vee$ (b), and hence its truth set is the
union of the truth sets of (a) and (b).

Suppose now that we consider a system of inequalities, that is a
set of two or more inequalities, and seek simultaneous solutions to
them. For example, consider the system

(d) $x \ge 0$

(e) $y \ge 0$

(f) $2x + 3y \le 6$.

Here we are asserting three different statements, i.e., we assert their
conjunction. Thus the truth set is the intersection of the three
individual truth sets. We already know that the locus of (f)
is the closed half plane shaded in Figure 1. The locus of (d)
is the right-hand closed half plane, while (e) has the upper
closed half plane as locus. The intersection of these is the
triangle (including the sides) shaded in Figure 2. This area contains
all points which satisfy the system of inequalities.
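Since the truth set of the system is an intersection, a point belongs to it exactly when it passes all three tests. A minimal sketch of this check follows; the function name `in_truth_set` is our own, and exact fractions are used so that boundary points such as (1,4/3) are classified correctly.

```python
from fractions import Fraction

def in_truth_set(x, y):
    """True when (x, y) satisfies (d), (e), and (f) simultaneously."""
    d = x >= 0                  # (d) right-hand closed half plane
    e = y >= 0                  # (e) upper closed half plane
    f = 2 * x + 3 * y <= 6      # (f) closed half plane on and below the line (a)
    return d and e and f

points = [(1, 1), (1, 2), (1, Fraction(4, 3)), (0, 0), (3, 0), (0, 2), (-1, 1)]
for p in points:
    print(p, in_truth_set(*p))
```

The three corner points and the boundary point (1,4/3) are accepted because the half planes are closed; (1,2) fails (f) and (-1,1) fails (d).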

Definition. The intersection of closed half planes is called a
polygonal convex set.

Theorem. The points which are simultaneous solutions of a
system of inequalities of the $\le$ type form a polygonal convex set.

This theorem follows from the fact that each inequality of the $\le$
(rather than <) type has a closed half plane as its truth set, and that
the system has as its truth set the intersection of these half planes.

EXERCISES 

1. Draw pictures of the polygonal convex sets which contain the
simultaneous solutions to the following systems of inequalities. (Construct the
individual closed half planes first, and then take their intersection.)

(a) $x \le 3$, $y \le 2$, $2x + 3y \ge 0$.

(b) $2x + 3y \ge 6$, $-x + y \le 2$, $x + y \le 3$.

(c) $2x + 3y \le 6$, $-x + y \le 2$, $x + y \le 3$.

(d) $y \ge 0$, $x \ge 0$, $x \le 2$.

(e) $x \le 2$, $x \ge -2$, $y \le 3$, $y \ge -3$.

(f) $3x + 2y \le -6$, $3x + 2y \le 6$.

(g) $3x + 2y \ge 6$, $3x + 2y \le 6$.

(h) $x - 2y \ge 0$, $x + y \le 0$.

(i) $x \le 2$, $x \ge 5$.

(j) $3x + 2y \ge 6$, $2x + 3y \ge 6$, $x \ge 0$, $y \ge 0$.

(k) $2x + y \ge 7$, $x \le 0$, $y \le 0$.


2. Consider the following sets:

$\mathcal{U}$ is the whole plane;

A is the half plane which is the locus of $-2x + y < 3$.

B is the half plane which is the locus of $-2x + y > 3$.

C is the half plane which is the locus of $-2x + y \le 3$.

D is the half plane which is the locus of $-2x + y \ge 3$.

L is the line which is the locus of $-2x + y = 3$.

$\mathcal{E}$ is the empty set.

Show that the following relationships hold among these sets: $\tilde{A} = D$, $\tilde{B} = C$,
$\tilde{L} = A \cup B$, $C \cap D = L$, $A \cap B = \mathcal{E}$, $A \cap C = A$, $B \cap D = B$,
$A \cup D = \mathcal{U}$, $B \cup C = \mathcal{U}$, $A \cup C = C$, $B \cup D = D$, $A \cup L = C$, $B \cup L = D$.

Can you find other relationships?

3. Of the polygonal convex sets constructed in Exercise 1, which have a 
finite area and which have infinite area? What is the boundary of those 
having finite area? 

[Ans. (c), (d), (f), (h), and (j) are of infinite area; (g) is a line.]

4. For each of the following half planes give an inequality of which it is
the truth set.

(a) The open half plane above the x-axis. [Ans. y > 0.]

(b) The closed half plane on and above the straight line making angles
of 45° with the positive x- and y-axis.
Exercises 5-9 refer to a situation in which a family decides to buy x books
and y record albums. The books cost $4 each, and the albums cost $3 each.

5. One cannot buy a negative number of books or albums. Write these 
conditions as inequalities, and draw their truth sets. 

6. There are only six books and six albums that they like. Modify the 
set found in Exercise 5 to take this into account. 

7. They are not willing to spend more than $24 altogether. Modify the 
set found in Exercise 6. 

8. They decide to spend at least twice as much on books as on records. 
Modify the set of Exercise 7. 

9. Finally, they decide that they want to spend $9 on records. What
possibilities are left? [Ans. None.]

10. Assume that the following statements are true: Every human being 
needs at least .02 g of phosphorus per day. Every adult human needs .01 g 
of calcium, every child (not an infant) needs .03 g of calcium, and every 
infant needs .04 g of calcium. Plot the amount of phosphorus on the vertical 
axis and the amount of calcium on the horizontal. Then draw in the convex 
sets of minimal requirements for adults, infants, and children. State whether 
or not the following assertions are true. 

(a) An adult’s needs are fulfilled only if a child’s needs are. 

(b) If a child’s needs are satisfied, then so are an infant’s. 

(c) A child’s needs are satisfied only if an infant’s are. 

[Ans. (a) True, (b) False, (c) False.]

11. Assume that the minimal requirements of human beings are given by
the following table:

              Phosphorus    Calcium
    Adult        .02          .01
    Child        .03          .03
    Infant       .01          .02


Plot the amount of phosphorus on the vertical axis and the amount of calcium 
on the horizontal. Then draw in the convex sets of minimal diet requirements 
for adults, children (noninfants), and infants. State whether or not the 
following assertions are true. 

(a) If a child’s needs are satisfied, so are an adult’s. 

(b) An infant’s needs are satisfied only if a child’s needs are. 

(c) An adult’s needs are satisfied only if an infant’s needs are. 
(d) Both an adult's and an infant's needs are satisfied only if a child's
needs are.

(e) It is possible to satisfy adult needs without satisfying the needs 
of an infant. 

2. MAXIMA AND MINIMA OF LINEAR FUNCTIONS 

A polygonal convex set may have either a finite or an infinite area. 
Both these possibilities are illustrated in Exercise 1 of the previous 
section. A set of finite area, like that in Figure 2, consists of a polygon 
together with its interior. It will be convenient to refer to this entire 
area as a polygon. Hence all polygonal convex sets of finite area are 
polygons, according to this terminology. Furthermore, each such 
polygon is a convex set (see Exercise 8 of this section); hence we will 
refer to them as convex polygons. 

A polygon with n sides has n corners. For example, the triangle of
Figure 2 has the three corners (0,0), (3,0), and (0,2). A corner is
formed by the intersection of two lines; hence the corner of one of the
polygons is the intersection of the boundaries of two of the half planes.

We can now give an interpretation for the various points of a convex
polygon in terms of the system of inequalities. A corner point lies on
two boundaries, which means that two of the inequalities are actually
equalities. A point on a side, other than a corner point, lies on one
boundary and hence one inequality is an equality. An interior point
of the polygon must, by a process of elimination, correspond to the
case where the inequalities are all strict inequalities, i.e., not only $\le$
but < holds.

The above description is for the usual case. Should three lines meet
in a point, for example, or should there be some redundant
inequalities, some obvious modifications would have to be made.

Example 1. Consider the system (d)-(f) of the last section. The
corner point (0,0) makes (d) and (e) equalities, i.e., x = 0 and y = 0.
For the corner (0,2) we find that (d) and (f) are equalities, and for
(3,0) the pair (e) and (f) turn into equalities. For boundary points
other than corners, the side on the y-axis has (d) as an equality, the
side on the x-axis has (e) as an equality, and (f) is an equality for the
slanting side. For points in the interior of the triangle no equalities
hold.
Example 2. Find the polygonal convex set defined by the following
system of inequalities:

$2x + y + 9 \ge 0$,

$-x + 3y + 6 \ge 0$,

$x + 2y - 3 \le 0$.

A rough sketch of the three half planes shows that the set is a triangle.
Hence we can find the corner points by changing two of the
inequalities to equalities and solving simultaneous equations. The first two
yield (-3,-3), the first and third (-7,5), while the last two give us
(21/5,-3/5). Hence the polygon is the triangle having these three corners.

Example 3. Let us suppose that in a business problem x and y are
quantities we can control, except that there are limitations imposed
which can be stated as inequalities. We shall assume that the system
of inequalities given in the previous example limits our choice of x
and y. Let us assume that a given choice of x and y results in a profit
of x + 2y dollars. What is the most and the least profit we can make?
We must find the maximum and the minimum value of x + 2y for
points (x,y) in the triangle. Let us first try the corners. At (-3,-3)
we would have a profit of -9, i.e., a loss of $9. At (-7,5) we have a
profit of $3, and at (21/5,-3/5) also a profit of $3. What can we say
about the remainder of the triangle? The last inequality tells us that
$x + 2y \le 3$, hence our profit cannot be more than $3. If we multiply
the first inequality by 5/7 and the second by 3/7 and add them, we find
that $x + 2y \ge -9$; hence we cannot lose more than $9. We have
thus shown that both the greatest profit and the greatest loss occur
at a corner point. We will show that this is true in general.


Given a convex polygon and a function ax + by (thought of as a
linear function of points in the plane), we want to show that the
maximum and minimum values of the function ax + by always are taken
on at a corner point of the polygon. First we will show that the values
of ax + by on any line segment lie between the values the function
has at the two end points (possibly equal to the value at one end
point).

We represent the points as row vectors (x,y), and then we see that
our function is the linear function represented by the column vector
$\begin{pmatrix} a \\ b \end{pmatrix}$. Let the end points of the segment be the row vectors p and q.
We have seen in Chapter V (cf. Figure 4) that the points in between
can be represented as tp + (1 - t)q, with $0 \le t \le 1$. If the values of
the function at the points p and q are A and B (assume that $B \le A$),
then at the point in between, the value will be tA + (1 - t)B, since
the function is linear. This value is B + (A - B)t, which is at least
B and at most A.
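The behaviour of a linear function along a segment can be observed directly. In the sketch below, the function x + 2y and the end points p = (-7,5) and q = (-3,-3) are borrowed from Examples 2 and 3 purely for illustration; the helper names are our own.

```python
from fractions import Fraction

def f(point, a=1, b=2):
    """The linear function a*x + b*y evaluated at a point (x, y)."""
    x, y = point
    return a * x + b * y

def point_on_segment(p, q, t):
    """The point t*p + (1 - t)*q, for 0 <= t <= 1."""
    return tuple(t * pi + (1 - t) * qi for pi, qi in zip(p, q))

p = (Fraction(-7), Fraction(5))    # value A at p
q = (Fraction(-3), Fraction(-3))   # value B at q
A, B = f(p), f(q)

values = []
for k in range(5):
    t = Fraction(k, 4)
    r = point_on_segment(p, q, t)
    # Linearity: the value at r is t*A + (1 - t)*B = B + (A - B)*t.
    assert f(r) == t * A + (1 - t) * B == B + (A - B) * t
    values.append(f(r))

print(A, B, values)
```

As t runs from 0 to 1 the value moves steadily from B to A, so it never leaves the interval between the two end-point values.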

We are now in a position to prove the result illustrated in
Example 3.

Theorem. A linear function defined over a convex polygon takes
on its maximum (and minimum) value at a corner point of the convex
polygon.

The proof of the theorem is illustrated in Figure 3. We shall
suppose that at the corner p the function takes on its largest corner
value, A, and at the corner q it takes on its smallest corner value, B.
Let r be any point of the polygon. Draw a straight line between p
and r and continue it until it cuts the polygon again at a point u lying
on an edge of the polygon, say the edge between the corner points
s and t. (The line may even cut the edge at one of the points s or t;
the analysis remains unchanged.) By hypothesis the value of the
function at any corner point must lie between B and A. By the above
result the value of the function at u must lie between its values at
s and t, and hence must also lie between B and A. Again by the above
result the value of the function at r must lie between its values at p
and u, and hence must also lie between B and A. Since r was any
point of the polygon our theorem is proved.

Figure 3 (the corner q carries the minimum corner value, B)

Suppose that in place of the linear function ax + by we had con¬ 
sidered the function ax + by + c. The addition of the constant c 
merely changes every value of the function, including the maximum 
and minimum values of the function, by that amount. Hence the 
analysis of where the maximum and minimum values of the function 
are taken on is unchanged. Therefore we have the following theorem. 

Theorem. The function ax + by + c defined over a convex polygon 
takes on its maximum (and minimum) value at a corner point 
of the convex polygon. 

The method of finding the maximum or minimum of the function 
ax + by + c over a convex polygon is then the following: Find the 
corner points of the set; there will be a finite number of them; sub¬ 
stitute the coordinates of each into the function; the largest of the 
values so obtained will be the maximum of the function and the 
smallest value will be the minimum of the function. The method is 
illustrated in Example 3 above. 
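
The method can be sketched in a few lines of Python. This is only an illustration: the function name `extremes_over_corners` is hypothetical, and the data are the corner points (2,3), (-2,3), (2,-3), (-2,-3) together with the function -2x + 5y + 17, both taken from the exercises that follow.

```python
def extremes_over_corners(corners, a, b, c=0):
    """Evaluate ax + by + c at every corner and report the extremes.

    Returns (minimum, points attaining it, maximum, points attaining it).
    """
    values = {p: a * p[0] + b * p[1] + c for p in corners}
    lo, hi = min(values.values()), max(values.values())
    min_points = [p for p in corners if values[p] == lo]
    max_points = [p for p in corners if values[p] == hi]
    return lo, min_points, hi, max_points

corners = [(2, 3), (-2, 3), (2, -3), (-2, -3)]
lo, lo_pts, hi, hi_pts = extremes_over_corners(corners, a=-2, b=5, c=17)
print("maximum", hi, "at", hi_pts)
print("minimum", lo, "at", lo_pts)
```

Returning lists of points rather than a single point leaves room for ties, where more than one corner gives the same extreme value.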

EXERCISES 

1. (a) Draw a picture of the convex polygon obtained in Example 2 

above. 

(b) Draw a picture of the convex set defined by the inequalities 

2x + y + 9 < 0, 

-x + 3y + 6 < 0, 

x + 2y - 3 < 0. 

(c) What is the relationship between the two figures? 

2. Find the corner points of the convex polygons given in parts (a), (b), 
and (e) of Exercise 1 following Section 1. 

[Ans. (a) (3,-2), (3,2), (-3,2); (e) (2,3), (-2,3), (2,-3), (-2,-3).] 

3. (a) Show that the three lines whose equations are 

2x + y + 9 = 0 
—x + 3y + 6 = 0 
x + 2y - 3 = 0 
divide the plane into seven convex regions. Mark these regions 
with roman numerals I-VII. 

(b) For each of the seven regions found in part (a), write a set of three 
inequalities, having the region as its locus. (Hint: Two of these 
sets of inequalities are considered in Exercise 1.) 

(c) There is one more way of putting inequality signs into the three 
equations given in (a). What is the locus of this last set of 
inequalities? [Ans. The empty set.] 

4. (a) Find the maximum and minimum of the function 

f(x,y) = -2x + 5y + 17 

over each of the convex polygons given in parts (a), (b), and (e) 
of Exercise 1 following Section 1. [Ans. (a) 33, 1; (e) 36, —2.] 
(b) Find the minimum, when it exists, of the function 

f(x,y) = 5x + Sy — 6 

over each of the polygonal convex sets given in parts (h), (i), and 
(j) of Exercise 1 following Section 1. 

[Ans. (h) Neither maximum nor minimum; (j) Minimum is 3.] 

5. (a) Find the corner points of the convex polygon given by the equa¬ 

tions 

2x + y + 9 > 0, 

— x + 3 y + 6 > 0, 
x + 2y — 3 < 0, 
x + y <0. 

(Hint: use some of the results of Example 2 in the text above.) 
(b) Find the maximum and minimum of the function 

f(x,y) = 7x + 5y - 3 

over the convex polygon given in part (a). 

[Ans. Maximum, 0; minimum, —39.] 

6. A convex polygon has the points ( — 1,0), (3,4), (0,-3), and (1,6) as 
corner points. Find a set of inequalities which defines the convex polygon 
having these corner points. 

7. Consider the polygonal convex set P defined by the inequalities 

— 1 < x < 4 
0 < y < 6. 

Find four different sets of conditions on the constants a and b that the 
function f(x,y) = ax + by should have its maximum at one and only one 
of the four corner points of P. Find conditions that f should have its 
minimum at each of these points. 

[Ans. For example, the maximum is at (4,6) if a > 0 and b > 0.] 

8. A set of points is said to be convex if whenever it contains two points 
it also contains the line segment connecting them. Show that: 

(a) If two points are in the truth set of an inequality, then any point 
on the connecting segment is also in the truth set. 

(b) Every polygonal convex set is a convex set in the above-mentioned 
sense. 

9. Give an example of a quadrilateral that is not a convex set. 

10. Prove that for any three vectors p, q, r, the set of all points ap + bq 
+ cr (a > 0, b > 0, c > 0, a + b + c = 1) is a convex set. What geometric 
figure is this locus? [Ans. In general, the locus is a triangle.] 

3. LINEAR PROGRAMMING PROBLEMS 

An important class of practical problems are those which require 
the determination of the maximum or the minimum of a function of 
the form ax + by + c of a point defined over a convex set of points. 
We illustrate these so-called linear programming problems by means 
of the following series of examples. 

Example 1. An advertiser wishes to sponsor a television comedy 
half hour and must decide on the composition of the show. Let x be 
the number of minutes of commercial time and let y be the number 
of minutes the comedian appears. By their definition x and y are 
nonnegative variables. Assume that the advertiser insists that there 
be at least three minutes of commercials, while the television network 
insists that the commercial time be limited to at most fifteen minutes. 
Now the commercial time plus the comedian time must fill up the 
half hour, i.e., x + y = 30. The latter equation can be written as a 
pair of inequalities x + y ≥ 30 and x + y ≤ 30. The inequalities 
defining our problem now are 

x ≥ 3, 
x ≤ 15, 
y ≥ 0, 
x + y ≥ 30, 
x + y ≤ 30. 

The “polygon” determined by these inequalities is the line segment 
shown in Figure 4. The corner points of the polygon are (3,27) and 
(15,15). 

The advertiser has two motives: to minimize the cost of the show, and 
to maximize the number of people who see it. Suppose the comedian costs 
$200 per minute, and the commercials cost $50 per minute. Then the 
cost function is 

C = 50x + 200y. 

Similarly, suppose that for every minute that the comedian is on the 
air 70,000 more people will tune in, and for every minute the 
commercial is on, one more person (e.g., a sponsor) will tune in. Then, 
if N is the total number of viewers, we have 

N = x + 70,000y. 

The advertiser now wants to minimize C and maximize N. By substituting 
in the coordinates of the corner points of the polygon we see 
that the point (15,15), i.e., 15 minutes of commercials and 15 minutes 
of comedy, gives the least cost of $3750 per show. On the other hand, 
the point (3,27), that is, 3 minutes of commercials and 27 minutes of 
comedy, gives the maximum number of viewers, namely, 1,890,003. 
These results are fairly obvious from a common-sense point of view. 

Example 2. Suppose now that the comedian (for lack of jokes) 
refuses to work more than 22 minutes each half-hour show. To fill 
in the remaining time an orchestra is added to the show. Now we 
have x + y ≤ 30 as the number of minutes that the commercials and 
comedian appear, and 30 - x - y as the number of minutes that the 
orchestra plays. Our inequalities now are: 

x ≥ 3, 
x ≤ 15, 
y ≥ 0, 
y ≤ 22, 
x + y ≤ 30. 

The polygon corresponding to these inequalities is shown in Figure 5. 
Here the set has the five corner points (3,0), (15,0), (15,15), (8,22), 
and (3,22). Suppose that the band costs $250 per minute; then the 
cost function is 


C = 50x + 200y + 250(30 - x - y) 

= 7500 - 200x - 50y. 

Here the minimum cost point is (15,15), giving a cost of $3750 per 
show. Since x + y = 30, we see that the band does not play in this 
solution. Let us assume first of all that the band has no viewer appeal. 
Then the N function will be the same as in Example 1. The maximum 
viewer point will then be (8,22), which gives 1,540,008 viewers. Again 
x + y = 30, so that the band does not play. 

[Figure 5: the convex polygon of Example 2. Its labels mark the maximum viewer point if the band has viewer appeal, the maximum viewer point if the band has no viewer appeal, the minimum cost points if the band costs $250 per minute, costs $150 per minute, or is free, and a line segment of minimum cost points if the band costs the same as the comedian.] 

Suppose now that the band does have viewer appeal. To be specific, 
assume that 10,000 more people view the show for each additional 
minute the band plays. Then our N function becomes 

N = x + 70,000y + 10,000(30 - x - y) 

= -9,999x + 60,000y + 300,000. 

Here the maximum viewer point is (3,22), giving a maximum number 
of 1,590,003 viewers. Observe that in this solution the band plays 
five minutes during each show. 
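
The corner points and the two viewer maxima of Example 2 can be reproduced mechanically. The sketch below is one possible way to do it, not a method taken from the text: it intersects every pair of boundary lines, keeps the intersections that satisfy all the inequalities, and then compares the values of N at those points. The names `corner_points`, `viewers_no_appeal`, and `viewers_with_appeal` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Each triple (a, b, c) stands for the inequality a*x + b*y <= c.
CONSTRAINTS = [
    (-1, 0, -3),   # x >= 3
    (1, 0, 15),    # x <= 15
    (0, -1, 0),    # y >= 0
    (0, 1, 22),    # y <= 22
    (1, 1, 30),    # x + y <= 30
]

def corner_points(constraints):
    """Intersect each pair of boundary lines; keep the feasible intersections."""
    corners = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines have no single intersection
        x = Fraction(c1 * b2 - c2 * b1, det)   # Cramer's rule
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            corners.add((x, y))
    return sorted(corners)

def viewers_no_appeal(x, y):
    return x + 70_000 * y

def viewers_with_appeal(x, y):
    return x + 70_000 * y + 10_000 * (30 - x - y)

corners = corner_points(CONSTRAINTS)
best_plain = max(corners, key=lambda p: viewers_no_appeal(*p))
best_band = max(corners, key=lambda p: viewers_with_appeal(*p))
print(corners)
print(best_plain, viewers_no_appeal(*best_plain))
print(best_band, viewers_with_appeal(*best_band))
```

Checking every pair of lines is wasteful for large problems, but with five inequalities there are only ten pairs, and the approach mirrors the hand computation.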

Example 3. Our advertiser finds that the financing of the show is 
becoming difficult and wishes to drop it. To try to induce him to keep 
the show, the television network reduces the price of the band to $150 
per minute. Then the cost function becomes 

C = 50x + 200y + 150(30 - x - y) 

= -100x + 50y + 4500. 

In this case the minimum cost point is (15,0), meaning that the 
comedian is dropped from the show, which now consists of 15 minutes 
of commercials and 15 minutes of band music. The minimum cost is 
now $3000 per show. 

Suppose that this cost is still too high for the advertiser, so that 
the network offers the band free providing the advertiser still pays 
for commercials. The cost function is 

C = 50x + 200y. 

Here the minimum cost point is (3,0), meaning that the program 
consists of 27 minutes of band music and 3 minutes of commercials. 
The minimum cost is $150 per show. 

Example 4. In Example 2 assume that the band and comedian 
each cost $200 per minute. Now the cost function is 

C = 50x + 200y + 200(30 - x - y) 

= -150x + 6000. 

Here something new happens, since both of the corner points (15,15) 
and (15,0) yield the minimum cost of $3750. Observe that the first 
point has 15 minutes of the comedian and no music, while the second 
has 15 minutes of music and no comedian. Which of these two solu¬ 
tions should we take? As far as cost goes it doesn’t matter, and both 
are equally good. In fact, if we let the comedian and band divide the 
15 minutes arbitrarily, this also is a solution giving minimum cost. 
Thus all the points (15,y) are possible minimum cost solutions where 
0 ≤ y ≤ 15. 

We have now discovered a general principle. Whenever two corner 
points each give the same value for our function, the entire connecting 
line segment also gives this value. This follows from the fact that the 
value on a line segment is always between the values at the end points 
(see the last section). Hence, if two corners both give the minimum 
(or maximum) value of the function, so does the entire connecting 
segment. 

The above examples show that any one of the corner points or even 
any point on the whole line segment connecting two of them (and 
hence any point on the polygon) can be the solution to a linear 
programming problem depending upon what the facts are and what is 
desired. But the facts are completely known if we know the values 
at the corner points. 

The exercises below should be worked in the same way. First find 
the corner points of the convex polygon, then set up the function 
which is to be maximized or minimized, and then check to see which 
corner point or points solve the problem. 
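
As an illustration of this procedure, the following sketch recomputes the minimum cost points of Examples 2 through 4 as the price of the band changes. The corner points and prices come from the examples; the helper names `cost` and `cheapest` are hypothetical.

```python
# Corner points of the polygon of Example 2.
CORNERS = [(3, 0), (15, 0), (15, 15), (8, 22), (3, 22)]

def cost(x, y, band_price):
    """Cost of x minutes of commercials, y of comedian, and the rest band."""
    return 50 * x + 200 * y + band_price * (30 - x - y)

def cheapest(band_price):
    """Return the minimum cost and every corner point that attains it."""
    costs = {p: cost(*p, band_price) for p in CORNERS}
    low = min(costs.values())
    return low, [p for p in CORNERS if costs[p] == low]

for price in (250, 200, 150, 0):
    print(price, cheapest(price))
```

When the band costs $200 per minute, two corners are returned; by the general principle above, every point of the segment joining them is then also a minimum cost point.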

EXERCISES 

1. In Example 2, assume that the comedian and band always cost more 
than the commercials. Then show that, if the advertiser wishes to minimize 
cost, the following statements are true. 

(a) If the band costs more than the comedian, the band should be 
dropped. 

(b) If the comedian costs more than the band, the comedian should 
be dropped. 

(c) If the comedian and band cost the same, the minimum cost point 
solution gives 15 minutes of commercials with the remaining 15 
minutes being shared in any proportion between the comedian 
and the band. 

2. A well-known nursery rhyme says “Jack Sprat could eat no fat. His 
wife (call her Jill) could eat no lean. . . .” Suppose Jack wishes to have at 
least one pound of lean meat per day, while Jill needs at least .4 pound of 
fat per day. Assume they buy only beef having 10 per cent fat and 90 per 
cent lean, and pork having 40 per cent fat and 60 per cent lean. Jack and 
Jill want to fulfill their minimal diet requirements at the lowest possible cost. 

(a) Let x be the amount of beef and y the amount of pork which they 
purchase per day. Construct the convex set of points in the plane 
representing purchases that fulfill both persons’ minimum diet 
requirements. 

(b) Suggest necessary restrictions on the purchases, that will change 
this set into a convex polygon. 

(c) If beef costs $1 per pound, and pork costs 50 cents per pound, 
show that the diet of least cost has only pork, and find the 
minimum cost. [Ans. $.83.] 

(d) If beef costs 75 cents and pork costs 50 cents per pound, show 
that there is a whole line segment of solution points and find the 
minimum cost. [Ans. $.83.] 

(e) If beef and pork each cost $1 a pound, show that the unique 

minimal cost diet has both beef and pork. Find the minimum 
cost. [Ans. $1.40.] 

(f) Show that the restriction made in part (b) did not alter the 
answers given in (c)-(e). 

3. In Exercise 2(d) show that for all but one of the minimal cost diets 
Jill has more than her minimum requirement of fat, while Jack always gets 
exactly his minimal requirement of lean. Show that all but one of the minimal 
cost diets contains some beef. 

4. In Exercise 2(e) show that Jack and Jill each get exactly their minimal 
requirements. 

5. In Exercise 2, if the price of pork is fixed at $1 a pound, how low must 
the price of beef fall before Jack and Jill will eat only beef? [Ans. $.25.] 

6. In Exercise 2, suppose that Jack decides to reduce his minimal require¬ 
ment to 0.6 pound of lean meat per day. How does the convex set change? 
How do the solutions in 2(c), (d), and (e) change? 

7. A poultry farmer raises chickens, ducks, and turkeys and has room 
for 500 birds on his farm. While he is willing to have a total of 500 birds, he 
does not want more than 300 ducks on his farm at any one time. Suppose 
that a chicken costs $1.50, a duck $1.00, and a turkey $4.00 to raise to 
maturity. Assume that the farmer can sell chickens for $3.00, ducks for 
$2.00, and turkeys for T dollars each. He wants to decide which kind of 
poultry to raise in order to maximize his profit. 

(a) Let x be the number of chickens and y be the number of ducks he 
will raise. Then 500 — x — y is the number of turkeys he raises. 
What is the convex set of possible values of x and y which satisfy 
the above restrictions? 

(b) Find the expression for the cost of raising x chickens, y ducks, and 
(500 — x — y) turkeys. Find the expression for the total amount 
he gets for these birds. Compute the profit which he would make 
under these circumstances. 

(c) If T = $6.00, show that to obtain maximal profit the farmer 
should raise only turkeys. What is the maximum profit? 

[Ans. $1000.] 

(d) If T = $5.00, show that he should raise only chickens and find 

his maximum profit. [Ans. $750.] 

(e) If T = $5.50, show that he can raise any combination of chickens 
and turkeys and find his maximum profit. [Ans. $750.] 

8. Rework Exercise 7 if the price of chickens drops to $2.00 and T is 

(a) $6.00, (b) $5.00, (c) $4.50, and (d) $4.00. 

9. In Exercise 7 show that if the price of turkeys drops below $5.50, the 
farmer should raise only chickens. Also show that if the price is above $5.50, 
he should raise only turkeys. 

10. Let f be a linear function of points (x,y) where (x,y) is a probability 
vector. 

(a) Write the restrictions on x and y. 

(b) Find the truth set of this system of conditions. 

(c) For what probability vectors could f possibly be a maximum? 

[Ans. (1,0) or (0,1).] 

Exercises 11-20 refer to the following problem. On a chicken farm there 
are 10 chickens and 32 eggs. In a given time period a chicken can either be 
used to hatch four eggs or to lay six eggs. The farmer wants to use his avail¬ 
able material for two time periods, and then sell all his chickens and eggs. 
(The product of the first period can be used in the second!) How should he 
employ his chickens to maximize his income? 

11. Let us suppose that x chickens hatch and 10 - x lay during the first 
period. Taking into account the availability of chickens and eggs, what 
restrictions must be placed on x? (We assume that eggs can be preserved, 
and hence not all eggs need be used.) [Ans. 0 ≤ x ≤ 8.] 

12. How many chickens will there be at the end of the first period, counting 
the original chickens plus the newly hatched ones? How many eggs will there 
be, counting the original ones plus the newly laid ones, less those that were 
hatched? 

13. Let us suppose that in the second period y chickens hatch, and the rest 
lay. Taking into account the availability of chickens and eggs, what re¬ 
strictions must be placed on y ? How many chickens will lay? 

14. Construct the convex polygon determined by the restrictions on x 
and y. [Hint: It has five sides.] 

15. How many eggs and how many chickens will there be at the end of the 
second period? [Ans. 152 + 14x - 10y; 10 + 4x + 4y.] 

16. If eggs sell for 5 cents and chickens for 25 cents, express in terms of x 
and y the farmer’s income. [Ans. 1010 + 170x + 50y.] 

17. For what values of x and y is this income largest? How much is the 
maximum possible income? [Ans. x = 8, y = 3; $25.20.] 

18. For the solution in Exercise 17 trace the number of eggs and chickens 
at each stage. What characterizes this solution? 

[Ans. Hatch the eggs as soon as possible.] 






19. If eggs sell for 5 cents and chickens for 35 cents, what values of x and 

y bring in the maximum income? What is it? [Ans. 2, 18; $31.50.] 

20. Trace the number of eggs and chickens, and compare with Exercise 18. 

4. STRICTLY DETERMINED GAMES 

We turn now from linear programming to the theory of games of 
strategy. Ultimately these two theories can be closely connected, but 
superficially they are quite different. We can consider a linear pro¬ 
gramming problem as that of a single person who tries to maximize 
or minimize a function (of two or more variables) defined over a po¬ 
lygonal convex set of values. In game theory we consider situations 
in which there are two (or sometimes more) persons, each of whose 
actions influence, but do not completely determine, the outcome of a 
single event. The objectives of the players in the game are (usually) 
different. Game theory provides a solution to such games, based on 
the principle that each player tries to choose his course of action so 
that, regardless of what his opponent does, the player can assure him¬ 
self of a certain amount. 

Most recreational games such as tick-tack-toe, checkers, backgam¬ 
mon, chess, poker, bridge, and other card games can be viewed as 
games of strategy. On the other hand, gambling games such as dice, 
roulette, etc., are not (as usually formulated) games of strategy, since 
a person playing one of these games is merely “betting against the 
odds.” 

The actual games of strategy mentioned above are too complicated, 
as they stand, to be analyzed completely. We shall instead construct 
simple examples which, although uninteresting from a player’s point 
of view, do illustrate the theory and which are amenable to computa¬ 
tion. 

In this section and the next we shall discuss some simple examples 
of games. The general definition of a matrix game will be given in 
Section 6. 

Example. Consider the following card game: Suppose there are 
two players, call them R and C (the reason for the use of these letters 
will be explained later); player R is given a hand consisting of a red 5 
and a black 5, while player C is given a black 5 and a red 3. The 
game that they are to play is the following: At a given signal the 
players simultaneously expose one of their two cards; if the cards 
match in color, player R wins the (positive) difference between the 
numbers on the cards; while if the cards do not match in color, player 
C wins the (positive) difference between the numbers on the cards 
played. Obviously the strategical decision that each player must 
make is which of his two cards to play. 

A convenient way of representing the game is by means of the 
matrix shown in Figure 6. (In game theory it is customary to present 
matrices in this “table” form.) The rows represent the possible 
choices of player R, and the columns the possible choices of C; hence 
our use of R and C. The number in position a_ij represents the gain 
of R if R chooses row i and C chooses column j. A positive entry is a 
payment from C to R, while a negative “gain” for R is a payment 
from R to C. For example, if R chooses row 1 (plays bk 5) and C 
chooses column 1 (plays bk 5), then R wins the difference of the two 
numbers, which is 0. If R chooses row 1 but C chooses column 2 
(plays rd 3), then C wins the difference of the two numbers, which 
is 2; this is indicated by the -2 entry in the matrix. The strategic 
characteristics of the game are completely described by the matrix. 

                   Player C 
                  bk 5   rd 3 
Player R   bk 5     0     -2 
           rd 5     0      2 

Figure 6 

The game shown in Figure 6 is called a matrix game. Any 2 × 2 
matrix can be considered a two-person matrix game by allowing one 
player to control the rows, the other the columns, and defining the 
payoffs of the game to be the various matrix entries. In Section 6 
we shall see that a matrix of any size can in the same way also be 
considered a matrix game. 

How should the players play the matrix game of Figure 6? Player 
C would like to get the -2 entry in the matrix; however, the only 
way he could get it would be to play the second column of the matrix, 
in which case player R would surely choose the second row and C 
would lose 2 rather than gain 2. On the other hand, if C chooses the 
first column (i.e., plays bk 5), he assures himself that he will break 
even regardless of what R does. It is clear that R has nothing to lose 
and may possibly gain by choosing the second row, hence he should 
do so. The knowledge that he will do so reinforces C in his choice 
of the first column. The optimal procedure for the players is then: 
R should play rd 5 and C should play bk 5. If they play this way, 
neither player wins from the other, that is, the game is fair. 

A command of the form: “Play rd 5,” or “Play bk 3,” will be called 
a strategy. If player R uses the strategy “Play rd 5” in the game of 
Figure 6 then, regardless of what C does, R assures himself that he 
will get at least a payoff of zero. Similarly, if C uses the strategy 
“Play bk 5,” then, regardless of what R does, C assures himself of 
obtaining a payoff of at most zero , i.e., a loss of at most zero. Since R 
cannot, by his own efforts, assure himself of gaining more than zero, 
and C cannot, by his own efforts, assure himself of losing less than 
zero, and since these two numbers are the same, we call these optimal 
strategies for the game. Also we call zero the value of the game, since 
it is the outcome of the game if each player uses his optimal strategy. 

Definition. We shall say that a 2 X 2 matrix game is strictly de¬ 
termined if the matrix contains an entry, call it v, which is simultane¬ 
ously the minimum of the row in which it occurs and the maximum 
of the column in which it occurs. Optimal strategies for the players 
are then the following: 

For player R: “Play the row that contains v” 

For player C: “Play the column that contains v” 

The value of the game is v. The game is fair if its value is zero. 

In Section 6 it will be shown that the strategies here defined are 
optimal in the sense indicated above, and that v has the property of 
being the best either player can assure for himself. 

The game of Figure 6 is strictly determined, since the 0 entry in 
the lower left-hand corner of the matrix is the minimum of the second 
row and the maximum of the first column of that matrix. Observe 
that the optimal strategies given in the definition above agree with 
those found above. The value of that game, according to the above 
definition, is zero; hence it is fair. 
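
The definition lends itself to a direct check. The sketch below builds the matrix of the card game from its rules and then searches for entries that are at once a row minimum and a column maximum. The function names `payoff` and `saddle_points` are hypothetical, and rows and columns are counted from 0 in the code, so row index 1 is the second row.

```python
def payoff(r_card, c_card):
    """Gain of R when the cards (color, number) are exposed together."""
    (r_color, r_num), (c_color, c_num) = r_card, c_card
    diff = abs(r_num - c_num)
    # Matching colors: R wins the difference; otherwise C wins it.
    return diff if r_color == c_color else -diff

def saddle_points(matrix):
    """List (row, column, value) for each entry that is both the minimum
    of its row and the maximum of its column."""
    return [(i, j, v)
            for i, row in enumerate(matrix)
            for j, v in enumerate(row)
            if v == min(row) and v == max(r[j] for r in matrix)]

R_HAND = [("bk", 5), ("rd", 5)]
C_HAND = [("bk", 5), ("rd", 3)]
game = [[payoff(r, c) for c in C_HAND] for r in R_HAND]

print(game)
print(saddle_points(game))
```

A game is strictly determined exactly when the list returned by `saddle_points` is not empty, and it is fair when the value found is zero.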

The solution of a strictly determined game is particularly easy to 
find since each player can calculate the other’s optimal strategy and 
hence know what he will do. Not all 2 X 2 matrix games are so easy 
to solve, as we shall see in the next section. 

In Figure 7 we show three matrix games. 

[Figure 7: three 2 × 2 matrix games, labeled (a), (b), and (c).] 

The game in Figure 7(a) 
is strictly determined and fair, and its optimal strategies are for R 
to choose the first row and C to choose the first column. The game in 
Figure 7(b) is strictly determined but not fair, since its value is 2. 
What are its optimal strategies? Finally, the game in Figure 7(c) is 
not strictly determined, and the solution of games such as this one 
will be the subject of the next section. 


EXERCISES 

1. Determine which of the games given below are strictly determined 
and which are fair. When the game is strictly determined find optimal 
strategies for each player. 

[Ans. (a) Strictly determined and fair; R play row 1, C play column 1; 
(b) nonstrictly determined; (e) strictly determined but not fair; R play 
row 1, C play column 1; (j) strictly determined but not fair; both 
players can use any strategy.] 

2. In the example suppose that R is given rd 5 and bk 3, and C is given 
bk 3 and rd 3. Set up the matrix game corresponding to it. Is it strictly 
determined? Is it fair? Find optimal strategies for each player. 

[Ans. Yes. Yes. Both play bk 3.] 

3. Each of the two players shows one or two fingers (simultaneously) 
and C pays to R a sum equal to the total number of fingers shown. Write 
the game matrix. Show that the game is strictly determined, and find the 
value and optimal strategies. 

4. Each of two players shows one or two fingers (simultaneously) and C 
pays to R an amount equal to the total number of fingers shown, while R 
pays to C an amount equal to the product of the numbers of fingers shown. 
Construct the game matrix (the entries will be the net gain of R), and find 
the value and the optimal strategies. 

[Ans. v = 1, R must show one finger, C may show one or two.] 

5. Show that a strictly determined game is fair if and only if there is a 
zero entry such that both entries in its row are nonnegative and both entries 
in its column are nonpositive. 

6. Consider the game 

G = 


(a) Show that G is strictly determined regardless of the value of a. 

(b) Find the value of G. [Ans. 2.] 

(c) Find optimal strategies for each player. 

(d) If a = 1,000,000, obviously R would like to get it as his payoff. 
Is there any way he can assure himself of obtaining it? What 
would happen to him if he tried to obtain it? 

(e) Show that the value of the game is the most that R can assure 
for himself. 

7. Consider the matrix game 


G = 
show that G is strictly determined for every set of values for a, c, and d. 
Show that the same result is true if two entries in a given column are always 
equal. 

8. Find necessary and sufficient conditions that the game 

    a   0 
    0   b 

should be strictly determined. (Hint: These will be expressed in terms of 
relations among the numbers a and b and the number zero.) 

9. Suppose that in the example discussed in the text, player R is given a 
hand consisting of bk x and rd y, and player C is given bk u and rd v, where 
x, y, u, and v are real numbers. Suppose that the matrix game which they 
play is the following: 

                   Player C 
                  bk u    rd v 
Player R   bk x   x - u   v - x 
           rd y   u - y   y - v 

(a) Show that if x = u, v > x, and y > x, the game is strictly 
determined and fair. 

(b) Show that if y = v, y > x, and y < u, the game is strictly 
determined and fair. 

10. Consider a strictly determined 2 × 2 matrix game G. Suppose u and v 
are two entries of the matrix such that each is the minimum of the row and 
the maximum of the column in which it occurs. Show that u = v. 

5. NONSTRICTLY DETERMINED GAMES 

As we saw in the numerical examples of the last section, some 
matrix games are nonstrictly determined, that is, they have no entry 
which is simultaneously a row minimum and a column maximum. 
We can characterize nonstrictly determined 2 × 2 matrix games as 
follows: 

Theorem. The matrix game 

    a   b 
    c   d 

is nonstrictly determined if and only if one of the following two 
conditions is satisfied: 

(i) a < b, a < c, d < b, and d < c. 

(ii) a > b, a > c, d > b, and d > c. 

(These inequalities mean that the two entries on one diagonal of the 
matrix must each be greater than each of the two entries on the other 
diagonal.) 

Proof. If either of the conditions (i) or (ii) holds, it is easy to check 
that no entry of the matrix is simultaneously the minimum of the 
row and the maximum of the column in which it occurs; hence the 
game is not strictly determined. 

To prove the other half of the theorem, recall that, by Exercise 7 
of the last section, if two of the entries in the same row or the same 
column of G are equal, the game is strictly determined; hence we can 
assume that no two entries in the same row or the same column are 
equal. 

Suppose now that a < b; then a < c or else a is a row minimum 
and a column maximum; then also c > d or else c is a row minimum 
and a column maximum; then also d < b or else d is a row minimum 
and a column maximum. Hence the assumption a < b leads to case 
(i) above. 

In a similar manner the assumption a > b leads to case (ii). This 
completes the proof of the theorem. 
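
The theorem can also be confirmed by brute force on small integer matrices. This is a sanity check under stated assumptions, not a substitute for the proof: the range of entries (-2 through 2) is an arbitrary choice, and the names `has_saddle` and `diagonal_test` are hypothetical.

```python
from itertools import product

def has_saddle(a, b, c, d):
    """True if some entry is both a row minimum and a column maximum."""
    m = [[a, b], [c, d]]
    return any(m[i][j] == min(m[i]) and m[i][j] == max(m[0][j], m[1][j])
               for i in range(2) for j in range(2))

def diagonal_test(a, b, c, d):
    """Conditions (i) and (ii) of the theorem."""
    cond_i = a < b and a < c and d < b and d < c
    cond_ii = a > b and a > c and d > b and d > c
    return cond_i or cond_ii

# Compare the two tests on every matrix with entries in -2..2.
agree = all(diagonal_test(*e) == (not has_saddle(*e))
            for e in product(range(-2, 3), repeat=4))
print(agree)
```

The check covers matrices with repeated entries as well, which correspond to the case handled in the proof by Exercise 7 of the last section.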

Example 1. Consider the card game of the example in the last 
section and assume that player R has bk 5 and rd 3 while player C 
has bk 3 and rd 5. The rules of play are as before. The corresponding 
matrix game is 

                   Player C 
                  bk 3   rd 5 
Player R   bk 5     2      0 
           rd 3     0      2 

which clearly is nonstrictly determined. 

Example 2. Consider again Chapter III, Section 2, Example 1. 
Recall that Jones conceals either a $1 or a $2 bill in his hand; Smith 
guesses 1 or 2, and wins the bill if he guesses its number. The matrix 
of this game is 

                          Smith guesses 
                            1      2 
Jones chooses   $1 bill    -1      0 
                $2 bill     0     -2 

Again the game is nonstrictly determined. 

How should one play a nonstrictly determined game? We must 
first convince ourselves that no one choice is clearly optimal for 
either player. In Example 1, R would like to win 2. But if he defi¬ 
nitely chooses bk 5, and C finds this out, C can bring about a zero 
by playing rd 5. If R chooses rd 3, C can bring about a zero by 
playing bk 3. Similarly, if C’s choice is found out by R, then R can 
win 2. So our first result is that each player must, in some way, 
prevent the other player from finding out which card he is going to 
play. 

We also note that for a single play of the game there is no difference 
between the two strategies, as long as one’s strategy is not guessed 
by the opponent. Let us now consider the game being played several 
times. What should R do? Clearly, he should not play the same card 
all the time, or C will be able to notice what R is doing, and profit 
by it. Rather, R should sometimes play one card, and sometimes the 
other! Our key question then is, “How often should R play each of 
his cards?” From the symmetry of the problem we can guess that he 
should play each card as often as the other, hence each one-half the 
time. (We will see later that this is, indeed, optimal.) In what order 
should he do this? For example, should he alternate bk 5 and rd 3? 
That is dangerous, because if C notices the pattern, he will gain by 
knowing just what R will do next. Thus we see that R should play 
bk 5 half the time, but according to some unguessable pattern. The 
only safe way of doing this is to play it half the time at random. He 
could, for example, toss a coin (without letting C see it) and play 
bk 5 if it comes up heads, rd 3 if it comes up tails. Then his opponent 
cannot guess his decision, since he himself won’t know what the decision 
is. Thus we conclude that a rational way of playing is for each 
player to mix his strategies, selecting sometimes one, sometimes the 
other; and these strategies should be selected at random, according 
to certain fixed ratios (probabilities) of selecting each. 

By a mixed strategy for player R we shall mean a command of the 
form, “Play row 1 with probability \(p_1\) and play row 2 with probability 
\(p_2\),” where we assume that \(p_1 \ge 0\) and \(p_2 \ge 0\) and \(p_1 + p_2 = 1\). 
Similarly, a mixed strategy for player C is a command of the form, “Play 
column 1 with probability \(q_1\) and play column 2 with probability \(q_2\),” 
where \(q_1 \ge 0\), \(q_2 \ge 0\), and \(q_1 + q_2 = 1\). A mixed strategy vector for 
player R is the probability row vector \((p_1, p_2)\), and a mixed strategy 
vector for player C is the probability column vector \(\binom{q_1}{q_2}\). 


Examples of mixed strategies are \((\tfrac12, \tfrac12)\) for R and, for C, a 
column vector whose two probabilities are unequal. The reader may 
wonder how a player could actually play one of these strategies. The 
mixed strategy \((\tfrac12, \tfrac12)\) is easy to realize since it is simply the 
coin-flipping strategy described above. A mixed strategy with other 
probabilities may be more difficult to realize, since there may be no 
chance device in common use that gives these probabilities. However, 
suppose that a pointer is constructed with a card that is partly shaded 
and partly unshaded in the required proportions, as in Figure 8, and C 
simply spins the pointer (without letting R see it, of course!). 
Then, if the pointer stops on the unshaded part he plays the first 
column, and if it stops on the shaded part, he plays the second column, 
and thus realizes the desired strategy. By varying the proportion of 
shaded area on the card other mixed strategies can conveniently be 
realized. 

Figure 8 

Consider the nonstrictly determined game 

\[ G = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

Having argued, as above, that the players should use mixed strategies 
in playing a nonstrictly determined game, it is still necessary to decide 
how to choose an optimal mixed strategy. 


Definition. For the nonstrictly determined game G the number v 
is its value and \(p^0 = (p_1^0, p_2^0)\) and \(q^0 = \binom{q_1^0}{q_2^0}\) are optimal strategies 
for R and C, respectively, if the following inequalities are satisfied: 

\[ p^0 G \ge (v, v), \tag{1} \]

\[ G q^0 \le \binom{v}{v}. \tag{2} \]

(If z and w are vectors, the inequality \(z \ge w\) means that each component 
of z is greater than or equal to the corresponding component 
of w.) The game is fair if v = 0. 

If R chooses a mixed strategy \(p = (p_1, p_2)\) and (independently) C 
chooses a mixed strategy \(q = \binom{q_1}{q_2}\), then player R obtains the payoff 
a with probability \(p_1 q_1\); he obtains the payoff b with probability \(p_1 q_2\); 
he obtains c with probability \(p_2 q_1\); and he obtains d with probability 
\(p_2 q_2\); hence his mathematical expectation (see Chapter IV, Section 
12) is then given by the expression 

\[ a p_1 q_1 + b p_1 q_2 + c p_2 q_1 + d p_2 q_2 = pGq. \]

By a similar computation, one can show that player C’s expectation 
is the negative of this expression. 

To justify this definition we must show that if \(v\), \(p^0\), \(q^0\) exist for 
G, each player can guarantee himself an expectation of v. Let q 
be any strategy for C. Multiplying (1) on the right by q, we get 
\(p^0 G q \ge (v, v)q = v\), which shows that, regardless of how C plays, R 
can assure himself of an expectation of at least v. Similarly, let p be 
any strategy vector for R. Multiplying (2) on the left by p, we obtain 
\(p G q^0 \le p\binom{v}{v} = v\), which shows that, regardless of how R plays, C can 
assure that R’s expectation is at most v. It is in this sense that 
\(p^0\) and \(q^0\) are optimal. It follows further that, if both players play 
optimally, then R’s expectation is exactly v and C’s expectation is 
exactly \(-v\). (Compare Exercise 11.) Hence we call v the (expected) 
value of the game. 
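
The expectation pGq can also be computed mechanically. The following is a minimal sketch, not part of the original text: the function name `expected_payoff` is an illustrative choice, strategies are given as plain lists, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def expected_payoff(p, G, q):
    """Return R's expectation pGq for row strategy p and column strategy q."""
    return sum(p[i] * G[i][j] * q[j]
               for i in range(len(p))
               for j in range(len(q)))

half = Fraction(1, 2)
G = [[2, 0], [0, 2]]   # the card game with payoffs 2, 0, 0, 2

# Both players flip a coin: each of the four outcomes has probability 1/4.
print(expected_payoff([half, half], G, [half, half]))   # prints 1
```

If R always plays the first row and C always plays the second column, the same function returns 0, the payoff in that position of the matrix.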

We must now see whether there are strategies \(p^0\) and \(q^0\) for the game 
G. While in more complicated games the finding of optimal strategies 
is a difficult task, for a 2 × 2 nonstrictly determined game the following 
formulas provide the solution. 

\[ p_1^0 = \frac{d - c}{a + d - b - c}, \tag{3} \]

\[ p_2^0 = \frac{a - b}{a + d - b - c}, \tag{4} \]

\[ q_1^0 = \frac{d - b}{a + d - b - c}, \tag{5} \]

\[ q_2^0 = \frac{a - c}{a + d - b - c}, \tag{6} \]

\[ v = \frac{ad - bc}{a + d - b - c}. \tag{7} \]

It is an easy matter to verify (see Exercise 12) that formulas (3)-(7) 
satisfy conditions (1)-(2). Actually, the inequalities in (1) and (2) 
become equalities in this simple case, a fact that is not true in general 
for nonstrictly determined games of larger size. 
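
Formulas (3)-(7) translate directly into a short routine. The sketch below is illustrative only: the name `solve_2x2` is hypothetical, the entries are assumed to be integers (or exact fractions), and the function assumes the caller has already checked that the game is nonstrictly determined; it only guards against a zero denominator.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d):
    """Apply formulas (3)-(7) to the game with rows (a, b) and (c, d).

    Assumes the game is nonstrictly determined; this is not verified here.
    """
    denom = a + d - b - c
    if denom == 0:
        raise ValueError("denominator is zero; the formulas do not apply")
    p = (Fraction(d - c, denom), Fraction(a - b, denom))   # (3), (4)
    q = (Fraction(d - b, denom), Fraction(a - c, denom))   # (5), (6)
    v = Fraction(a * d - b * c, denom)                     # (7)
    return p, q, v

p, q, v = solve_2x2(2, 0, 0, 2)
print(p, q, v)
```

For the matrix with entries 2, 0, 0, 2 this gives probabilities 1/2 and 1/2 for each player and the value 1; for the matrix with entries -1, 0, 0, -2 it gives probabilities 2/3 and 1/3 for each player and the value -2/3.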

The denominator in each formula is the difference between the 
sums of the entries on the two diagonals. Since, for a nonstrictly 
determined game, the entries on one diagonal must be larger than those 
on the other, the denominator cannot be zero. The reader will recognize 
the numerator of v as the determinant 

\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc. \]


Let us use these formulas to solve the examples mentioned earlier. 


Example 1 (continued). The solution is easily found by substituting 
into the above formulas. We obtain \((\tfrac12, \tfrac12)\) as the optimal strategy 
for R and \(\binom{1/2}{1/2}\) as the optimal strategy for C. Hence each player 
should use the coin-flipping strategy for optimal results. The value 
of the game is plus 1, which means that it is biased in R’s favor, and 
R has an expected gain of 1 per game. 

Example 2 (continued). Substitution into the formulas gives 
\((\tfrac23, \tfrac13)\) as the optimal strategy for R and \(\binom{2/3}{1/3}\) as the optimal 
strategy for C. The value of the game is \(-\tfrac23\), which means that the game 
is biased in Smith’s favor. Smith should then pay \(66\tfrac23\) cents to play the 
game, which answers the question raised in Chapter III. 


EXERCISES 

1. Find the optimal strategies for each player and the values of the 
following games: 

2. Set up the ordinary game of matching pennies as a matrix game. 
Find its value and optimal strategies. How are the optimal strategies realized 
in practice by players of this game? 

3. A version of two-finger Morra is played as follows: Each player holds 
up either one or two fingers; if the sum of the number of fingers shown is 
even, player R gets the sum, and if the sum is odd, player C gets it. 

(a) Show that the game matrix is 

\[
\begin{array}{cc|cc}
 & & \multicolumn{2}{c}{\text{Player C}} \\
 & & 1 & 2 \\ \hline
\text{Player R} & 1 & 2 & -3 \\
 & 2 & -3 & 4
\end{array}
\]

(b) Find optimal strategies for each player and the value of the game. 

[Ans. \((\tfrac{7}{12}, \tfrac{5}{12})\); \(\binom{7/12}{5/12}\); \(v = -\tfrac{1}{12}\).] 

4. Rework Exercise 3 if player C gets the even sum and player R gets 
the odd sum. 

5. Consider the following “war” problem: Some attacking bombers are 
attempting to bomb a city that is protected by fighters. The bombers can 
each day attack either “high” or “low,” the low attack making the bombing 
more accurate. Similarly, the fighters can each day look for the bombers 
either “high” or “low.” Credit the bombers with six points if they avoid 
the fighters, and zero if the fighters find them. Also credit the bombers with 
three extra points for accurate bombing if they fly low. 

(a) Set up the game matrix. 

(b) Find optimal strategies for each player. 

(c) Give instructions to the bomber and fighter commanders so that 
by flipping coins they can decide what to do. 

[Ans. (c) The bomber commander should flip one coin to decide 
whether to go high or low. The fighter commander should flip two 
coins and then go high if both turn up heads.] 

6. Generalize the problem in Exercise 5 by crediting the bombers with x 
points for avoiding the fighters and y points for flying low. (Assume that x 
and y are positive.) 

(a) Set up the matrix. 

(b) If y > x show that the game is strictly determined, and find 
optimal strategies. 

(c) If y < x show that the game is nonstrictly determined and find 
optimal strategies. 

(d) Comment on these results, with special attention to the bombers' 
strategies. 


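7. If \(G = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\) is nonstrictly determined, prove that it is fair if 
and only if 

\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc = 0. \]

8. In formulas (3)-(7) prove that \(p_1^0 > 0\), \(p_2^0 > 0\), \(q_1^0 > 0\), and \(q_2^0 > 0\). 
Must v be greater than zero? 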

9. Utilizing the results of Exercise 7 of the last section, find necessary 
and sufficient conditions that the game 

\[ G = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \]

be nonstrictly determined. Find optimal strategies for each player and the 
value of G, if it is nonstrictly determined. 

[Ans. a and b must be both positive or both negative. \(p_1 = b/(a + b)\); 
\(p_2 = a/(a + b)\); \(q_1 = b/(a + b)\); \(q_2 = a/(a + b)\); \(v = ab/(a + b)\).] 

10. Suppose that R is given bk x and rd y while C is given bk u and rd v 
(where x, y, u, and v stand for positive integers). Let them play the matrix 
game 

\[
\begin{array}{c|cc}
 & \text{bk } u & \text{rd } v \\ \hline
\text{bk } x & xu & -xv \\
\text{rd } y & -yu & yv
\end{array}
\]

Show that the game is always nonstrictly determined, and always fair. 

11. If \(G\), \(p^0\), \(q^0\), and \(v\) are as in the definition, show that \(v = p^0 G q^0\). 

12. Verify that (3)-(7) satisfy the conditions (1) and (2). 


6. MATRIX GAMES 

We shall consider a large class of games in this section, and discuss 
them in considerable generality. Our games are played between two 
players, according to strictly specified rules. Each player performs 
certain actions, as specified by the rules of the game, and then, at the 
end of the play of the game, one of the players may have to pay a 
sum of money to the other player. The game may be repeated many 
times. 

During such a game a player may have to make many strategic 
decisions. By a (pure) strategy for one of the players we mean a com¬ 
plete set of rules as to how he should make his decisions. We shall 
illustrate this in terms of the game of tick-tack-toe (and nearly the 
same remarks would apply to any game in which the players take 
turns moving). Let us construct a strategy for the player who moves 
first. His first decision concerns the opening move. He may choose 
any one of nine squares, and the strategy must tell him which choice 
to make. Let us say we tell him to move into the upper left-hand 
corner. His opponent may answer this in one of eight ways, and the 
strategy must be prepared for each alternative. It must have eight 
rules, such as “If he moves into the middle, move into the lower 
right-hand corner!” For every such move the opponent may respond 
with one of several alternatives, and the strategy must again have an 
answering move ready for each of them, etc. Hence the strategy takes 
into account every conceivable position of the first player, and in¬ 
structs what move to make in each one. 

A strategy may be thought of as a set of instructions to be given 
to a machine, so that the machine will play the game exactly the way 
we would have. 

We number the strategies of the first player 1, 2, . . . , m, and those 
of the second player 1, 2, . . . , n. Since each of the players must play 
according to one of his strategies, the game may proceed in any one 
of mn ways, and if each player chooses a definite strategy, the outcome 
is determined. We may think of giving the two strategies to 
two machines, and let them work out what happens. Let us suppose 
that, when the first player chooses strategy i and the second strategy 
j, the former wins an amount \(a_{ij}\). We arrange these numbers \(a_{ij}\) into 
an m × n matrix, the game matrix. We may then think of the game 
as consisting of a choice of a row by the first player, and a column by 
the second player. Hence we see that any game specified by rules 
may be thought of as a matrix game. 

Conversely, every matrix can be considered as a game. An m × n 
matrix may be thought of as a game between two players, in which 
player R chooses one of the m rows and player C simultaneously 
chooses one of the n columns. The outcome of the game is that C 
pays to R an amount equal to the entry of the matrix in the chosen 
row and column. (A negative entry represents a payment from R to 
C, as usual.) 

In an m X n matrix game, the player R has m pure strategies, and 
the player C has n. We have seen in the last section that, in addition, 
we must consider the mixed strategies of the two players. We extend 
this concept to m X n games. 


Definition. An m-component row vector p is a mixed-strategy vector 
for R if it is a probability vector; similarly, an n-component column 
vector q is a mixed-strategy vector for C if it is a probability vector. 
(Recall from Chapter V that a probability vector is one with nonnegative 
entries whose sum is 1.) Let V and V' be the vectors 

\[
V = \underbrace{(v, v, \ldots, v)}_{n \text{ components}}
\quad\text{and}\quad
V' = \left.\begin{pmatrix} v \\ v \\ \vdots \\ v \end{pmatrix}\right\}\; m \text{ components}
\]

where v is a number. Then v is the value of the game and \(p^0\) and \(q^0\) are 
optimal strategies for the players if and only if the following inequalities 
hold: 

\[ p^0 G \ge V, \qquad G q^0 \le V'. \]



In Sections 4 and 5 we have given several examples of such matrix 
games together with their solutions. Notice that we have not proved 
that an arbitrary matrix game has a value and optimal strategies for 
each player; that question will be discussed in the next section. 
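
The two inequalities of the definition can be checked component by component for any proposed solution. The following is a minimal sketch under stated assumptions: the name `is_solution` is hypothetical, the matrix is a list of rows, and exact fractions are assumed so that the comparisons are not disturbed by rounding.

```python
from fractions import Fraction

def is_probability_vector(x):
    """Nonnegative entries whose sum is 1."""
    return all(t >= 0 for t in x) and sum(x) == 1

def is_solution(G, p, q, v):
    """Check that every component of pG is >= v and every component of Gq is <= v."""
    m, n = len(G), len(G[0])
    if len(p) != m or len(q) != n:
        return False
    if not (is_probability_vector(p) and is_probability_vector(q)):
        return False
    pG = [sum(p[i] * G[i][j] for i in range(m)) for j in range(n)]   # n components
    Gq = [sum(G[i][j] * q[j] for j in range(n)) for i in range(m)]   # m components
    return all(x >= v for x in pG) and all(x <= v for x in Gq)

half = Fraction(1, 2)
print(is_solution([[2, 0], [0, 2]], [half, half], [half, half], 1))   # prints True
```

A proposed solution that fails either inequality, such as the pure strategies (1, 0) for both players with v = 2 in the same game, is rejected.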

Theorem. If G is a matrix game which has a value and optimal 
strategies, then the value of the game is unique. 

Proof. Suppose that v and w are two different values for the game 
G. Let \(V = (v, v, \ldots, v)\) and \(W = (w, w, \ldots, w)\) be n-component 
row vectors, and let 

\[
V' = \begin{pmatrix} v \\ v \\ \vdots \\ v \end{pmatrix}
\quad\text{and}\quad
W' = \begin{pmatrix} w \\ w \\ \vdots \\ w \end{pmatrix}
\]

be m-component column vectors. Then let \(p^0\) and \(q^0\) be optimal mixed 
strategy vectors associated with the value v such that 

(a) \(p^0 G \ge V\), 

(b) \(G q^0 \le V'\). 

Similarly, let \(p^1\) and \(q^1\) be optimal mixed strategy vectors associated 
with the value w such that 

(c) \(p^1 G \ge W\), 

(d) \(G q^1 \le W'\). 

If we now multiply (a) on the right by \(q^1\), we get \(p^0 G q^1 \ge V q^1 = v\). 
In the same way, multiplying (d) on the left by \(p^0\) gives \(p^0 G q^1 \le w\). 
The two inequalities just obtained show that \(w \ge v\). 

Next we multiply (b) on the left by \(p^1\) and (c) on the right by \(q^0\), 
obtaining \(v \ge p^1 G q^0\) and \(p^1 G q^0 \ge w\), which together imply that \(v \ge w\). 

Finally we see that \(v \le w\) and \(v \ge w\) imply together that \(v = w\), 
that is, the value of the game is unique. 

Theorem. If G is a matrix game with value v and optimal strate¬ 
gies p° and q°, then v = p°Gq°. 

Proof. By definition \(v\), \(p^0\), and \(q^0\) satisfy 

\[ p^0 G \ge V \quad\text{and}\quad G q^0 \le V'. \]

Multiplying the first of these inequalities on the right by \(q^0\), we get 
\(p^0 G q^0 \ge v\). Similarly, multiplying the second inequality on the left by 
\(p^0\), we obtain \(p^0 G q^0 \le v\). These two inequalities together imply that 
\(v = p^0 G q^0\), concluding the proof. 


The theorem just proved is important because it permits us to give 
an interpretation of the value of a game as an expected value in the 
sense of probability (see Chapter IV, Section 12). Briefly the interpretation 
is the following: If the game G is played repeatedly and if 
each time it is played player R uses the mixed strategy \(p^0\) and player 
C uses the mixed strategy \(q^0\), then the value v of G is the expected 
value of the game for R. The law of large numbers implies that, if 
the number of plays of G is sufficiently large, then the average value 
of R’s winnings will (with high probability) be arbitrarily close to the 
value v of the game G. 

As an example, let G be the matrix of the game of matching pennies, 
i.e., 

\[ G = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}. \]

As was found in Exercise 2 of the last section, optimal strategies in 
this game are for R to choose each row with probability \(\tfrac12\) and for C 
to choose each column with probability \(\tfrac12\). The value of G is zero. 
Notice that the only two payoffs that result from a single play of the 
game are +1 and −1, neither of which is equal to the value (zero) 
of the game. However, if the game is played repeatedly, the average 
value of R’s payoffs will approach zero, which is the value of the game. 
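
The long-run behavior described above can be illustrated by simulation. The sketch below is an illustration only, not part of the original text: the function name `average_payoff`, the number of plays, and the fixed seed are arbitrary choices, and Python's standard pseudo-random generator stands in for the coin.

```python
import random

def average_payoff(G, p, q, plays, seed=0):
    """Play the matrix game G repeatedly with mixed strategies p and q
    and return R's average payoff per play."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    rows = list(range(len(p)))
    cols = list(range(len(q)))
    total = 0
    for _ in range(plays):
        i = rng.choices(rows, weights=p)[0]   # R's random choice of row
        j = rng.choices(cols, weights=q)[0]   # C's independent choice of column
        total += G[i][j]
    return total / plays

pennies = [[1, -1], [-1, 1]]
print(average_payoff(pennies, [0.5, 0.5], [0.5, 0.5], plays=20000))
```

Every single play pays +1 or −1, yet the printed average should lie close to zero; how close depends on the number of plays and on the particular random sequence.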

Theorem. If G is a game with value v and optimal strategies 
p° and q°, then v is the largest expectation that R can assure for him¬ 
self. Similarly, v is the smallest expectation that C can assure for 
himself. 

Proof. Let p be any mixed strategy vector of R and let \(q^0\) be an 
optimal strategy for C; then multiply the inequality \(G q^0 \le V'\) on the 
left by p, obtaining \(p G q^0 \le v\). The latter inequality shows that, if C 
plays optimally, the most that R can assure for himself is v. Now let 
\(p^0\) be optimal for R; then, for every q, \(p^0 G q \ge v\), so that R can actually 
assure himself of an expectation of v. The proof of the other statement 
of the theorem is similar. 

The above theorem gives an intuitive justification to the definition 
of value and optimal strategies for a game. Thus the value is the 
“best” that a player can do and optimal strategies are the means of 
achieving this “best.” 

Definition. A matrix game G is strictly determined if there is an 
entry \(g_{ij}\) in G that is the minimum entry in the ith row and the maximum 
entry in the jth column. (By rearranging and renumbering the 
rows and columns of a strictly determined matrix game G we can 
assume that \(g_{11}\) is an entry that is the minimum of row 1 and the 
maximum of column 1.) 


Theorem. If G is a strictly determined matrix game, arranged as 
indicated in the definition, the value of the game is \(v = g_{11}\). Moreover, 
optimal strategies for the players are 

\[
p^0 = (1, 0, 0, \ldots, 0)
\quad\text{and}\quad
q^0 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\]

(These optimal strategies simply say that R should choose the row 
that contains the entry \(g_{11}\) (the first row) and C should choose the 
column that contains the entry \(g_{11}\) (the first column). Compare these 
optimal strategies with those found in Section 4 for strictly determined 
2 × 2 games.) 


Proof. Suppose that G is strictly determined and the rows and 
columns of G are so arranged and numbered that \(g_{11}\) is an entry of G 
that is the minimum of row 1 and the maximum of column 1. Then 
we set \(v = g_{11}\) and let \(p^0\) and \(q^0\) be the strategies as defined in the 
statement of the theorem. We have 

\[ p^0 G = (g_{11}, g_{12}, \ldots, g_{1n}) \ge (g_{11}, g_{11}, \ldots, g_{11}) = V, \]

where we have used the fact that \(g_{11}\) was the minimum of the first 
row. Similarly, using the fact that \(g_{11}\) is the maximum of the first 
column, we have 

\[
G q^0 = \begin{pmatrix} g_{11} \\ g_{21} \\ \vdots \\ g_{m1} \end{pmatrix}
\le \begin{pmatrix} g_{11} \\ g_{11} \\ \vdots \\ g_{11} \end{pmatrix} = V'.
\]

From these two inequalities and the definition of a matrix game given 
above, we conclude that v is the value of the game and \(p^0\) and \(q^0\) are 
optimal strategies. 
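
Whether a given matrix game is strictly determined can be decided by a direct search for entries that are both a row minimum and a column maximum. The following is a small sketch; the function name `saddle_points` and the sample matrix are illustrative, and positions are reported with rows and columns counted from zero.

```python
def saddle_points(G):
    """Return all (row, column) positions, counted from zero, whose entry is
    the minimum of its row and the maximum of its column."""
    col_max = [max(col) for col in zip(*G)]
    return [(i, j)
            for i, row in enumerate(G)
            for j, x in enumerate(row)
            if x == min(row) and x == col_max[j]]

G = [[4, 2, 3],
     [1, 0, 5]]
print(saddle_points(G))   # prints [(0, 1)]
```

Here the entry 2 is the smallest in its row and the largest in its column, so the game is strictly determined with value 2. An empty list means the game is not strictly determined.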

Theorem. If \(g_{11}\) and \(g_{ij}\) are two entries of G that are the minima 
of the rows and the maxima of the columns in which they occur, then 
\(v = g_{11} = g_{1j} = g_{i1} = g_{ij}\). 

Proof. Using the facts that \(g_{11}\) and \(g_{ij}\) are the minima of the rows 
and the maxima of the columns in which they occur we see that 

\[ g_{ij} \le g_{i1} \le g_{11}, \qquad g_{ij} \ge g_{1j} \ge g_{11}. \]

(These inequalities are redundant but still true if either i = 1 or 
j = 1.) These two sets of inequalities imply that \(g_{ij} = g_{i1} = g_{11} = 
g_{1j} = v\), completing the proof of the theorem. 

Example 1. Although we have proved that the value of a game 
is unique, it may happen that a game has more than one pair of optimal 
strategies. For example, let G be the game 

\[
G = \begin{pmatrix}
1 & 5 & 1 & 7 \\
-2 & 8 & 0 & -9 \\
1 & 12 & 1 & 3
\end{pmatrix}.
\]

Then we see that G is strictly determined with value 1, and the 
optimal strategies are (1,0,0) and (0,0,1) for player R and 

\[
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}
\]

for player C. In the next theorem we shall see that there are still 
other optimal strategies for this game. 


Theorem. If \(p^0\) and \(p^1\) are two optimal strategies for R in a matrix 
game G, then the strategy 

\[ p = a p^0 + (1 - a) p^1, \]

where a is any number satisfying \(0 \le a \le 1\), is also an optimal 
strategy for R. 

Similarly, if \(q^0\) and \(q^1\) are optimal strategies for C in G, then the 
strategy 

\[ q = a q^0 + (1 - a) q^1, \]

where a is any number satisfying \(0 \le a \le 1\), is also an optimal 
strategy for C. 

Proof. We shall prove the first statement only and leave the second 
as an exercise (see Exercise 3). It is easy to show that p is a probability 
vector. By hypothesis we have \(p^0 G \ge V\) and \(p^1 G \ge V\). Hence 
we see that 

\[
\begin{aligned}
pG &= [a p^0 + (1 - a) p^1] G \\
   &= a p^0 G + (1 - a) p^1 G \\
   &\ge aV + (1 - a)V = V,
\end{aligned}
\]

which shows that p is also an optimal strategy, completing the proof 
of the theorem. 

This theorem implies that, in Example 1, the strategies of the form 
a(1,0,0) + (1 − a)(0,0,1) = (a, 0, 1 − a) are optimal for R. It is easy 
to check that (|,0,J) and (i,0,f) are optimal and of this form. 
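
A check of this kind can be carried out mechanically for the 3 × 4 matrix of Example 1. The following sketch is illustrative: the helper names `mix` and `guarantees` and the sample values of a are choices made here, not taken from the text.

```python
from fractions import Fraction

G = [[1, 5, 1, 7],
     [-2, 8, 0, -9],
     [1, 12, 1, 3]]

def mix(a, p0, p1):
    """Return the strategy a*p0 + (1 - a)*p1."""
    return [a * x + (1 - a) * y for x, y in zip(p0, p1)]

def guarantees(p, G, v):
    """True if every component of pG is at least v."""
    return all(sum(p[i] * G[i][j] for i in range(len(G))) >= v
               for j in range(len(G[0])))

for a in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    p = mix(a, [1, 0, 0], [0, 0, 1])
    print(p, guarantees(p, G, 1))
```

Each of these mixtures assures R of at least the value 1, while a strategy such as (0, 1, 0), which uses the second row, does not.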


EXERCISES 


1. Find the value and all optimal strategies for the following games. 



[Ans. v — 5; (0,1,0); 



0 

5 

6 

-3 

1 

-1 

2 

3 

1 

2 

3 

4 

-1 

0 

7 

5 

2. Find the value of and all optimal strategies for the following games. 



3. If \(q^0\) and \(q^1\) are optimal strategies for C in the matrix game G, show 
that the strategy 

\[ q = a q^0 + (1 - a) q^1, \]

where a is a constant with \(0 \le a \le 1\), is also optimal in the game G. 

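4. Verify that the strategies \(p^0 = (\tfrac13, \tfrac13, \tfrac13)\) and 

\[ q^0 = \begin{pmatrix} 1/3 \\ 1/3 \\ 1/3 \end{pmatrix} \]

are optimal in the game G whose matrix is 

\[ G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

What is the value of the game? 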

5. Generalize the result of Exercise 4 to the game G whose matrix is the 
n X n identity matrix. 

6. Suppose that player R tries to find C in one of three towns X, Y, and Z. 
The distance between X and Y is five miles, the distance between Y and Z 
is five miles, and the distance between Z and X is ten miles. Assume that 
R and C can go to one and only one of the three towns and that if they both 
go to the same town, R “catches” C and otherwise C “escapes.” Credit R 
with ten points if he catches C, and credit C with a number of points equal 
to the distance he is away from R if he escapes. 

(a) Set up the game matrix. 

(b) Show that both players have the same optimal strategy, namely, 
to go to towns X and Z with equal probabilities and to go to town 
Y with probability \(\tfrac14\). 

(c) Find the value of the game. 

7. A version of five-finger Morra is played as follows: Each player shows 
from one to five fingers, and the sum is divided by three. If the sum is 
exactly divisible by three, there is no exchange of payoffs. If there is a 
remainder of one, player R wins a sum equal to the total number of fingers, 
while if the remainder is two, player C wins the sum. 

(a) Set up the game matrix. [Hint: It is 5 × 5.] 

(b) Verify that an optimal strategy for either player is to show one or 
five fingers with probability \(\tfrac19\) each, to show two or four fingers with 
probability \(\tfrac29\) each, and to show three fingers with probability \(\tfrac13\). 

(c) Is the game fair? [Ans. Yes.] 

8. Consider the following game: 

\[ G = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \]

(a) If a, b, and c are not all of the same sign, show that the game is 
strictly determined with value zero. 

(b) If a, b, and c are all of the same sign, show that the vector 

\[ \left( \frac{bc}{ab + bc + ca},\; \frac{ca}{ab + bc + ca},\; \frac{ab}{ab + bc + ca} \right) \]

is an optimal strategy for player R. 

(c) Find player C’s optimal strategy for case (b). 

(d) Find the value of the game for case (b) and show that it is positive 
if a, b, and c are all positive, and negative if they are all negative. 

9. Two players agree to play the following game. The first player will 
show 1, 2, or 4 fingers. The second player will show 2, 3, or 5 fingers, 
simultaneously. If the sum of the fingers shown is 3, 5, or 9, the first player 
receives this sum. Otherwise no payment is made. 

(a) Set up the game matrix. 

(b) Use the results of Exercise 8 to solve the game. 

(c) How much should the first player be willing to pay to play the 
game? [Ans. \(\tfrac{45}{29}\).] 

10. Consider the (symmetric) game whose matrix is 

\[ G = \begin{pmatrix} 0 & -a & -b \\ a & 0 & -c \\ b & c & 0 \end{pmatrix} \]

(a) If a and b are both positive or both negative, show that G is 
strictly determined. 

(b) If b and c are both positive or both negative, show that G is 
strictly determined. 

(c) If a > 0, b < 0, and c > 0, show that an optimal strategy for 
player R is given by 

\[ \left( \frac{c}{a - b + c},\; \frac{-b}{a - b + c},\; \frac{a}{a - b + c} \right). \]

(d) In part (c) find an optimal strategy for player C. 

(e) If a < 0, b > 0, and c < 0 show that the strategy given in (c) is 
optimal for R. What is an optimal strategy for player C? 

(f) Prove that the value of the game is always zero. 

11 . In a well-known children’s game each player says “stone” or “scissors” 
or “paper.” If one says “stone” and the other “scissors,” then the former 
wins a penny. Similarly, “scissors” beats “paper,” and “paper” beats 
“stone.” If the two players name the same item, then the game is a tie. 

(a) Set up the game matrix. 

(b) Use the results of Exercise 10 to solve the game. 

12. In Exercise 11 let us suppose that the payments are different in 
different cases. Suppose that when “stone breaks scissors,” the payment is 
one cent; when “scissors cut paper,” the payment is two cents; and when 
“paper covers stone,” the payment is three cents. 

(a) Set up the game matrix. 

(b) Use the results of Exercise 10 to solve the game. 

[Ans. \(\tfrac13\) “stone,” \(\tfrac12\) “scissors,” \(\tfrac16\) “paper”; v = 0.] 


7. MORE ON MATRIX GAMES: THE FUNDAMENTAL THEOREM 

Here we continue the discussion of the basic properties of matrix 
games. First we show what happens to the game if each entry in the 
matrix is multiplied by a nonnegative constant or if the same con¬ 
stant is added to each entry in the matrix. Then we discuss the funda¬ 
mental existence theorem for matrix games. 

Theorem. If k is a nonnegative number, i.e., \(k \ge 0\), and G is a 
matrix game with value v, then the game kG is a matrix game with 
value kv, and every strategy optimal in G is also optimal in kG. 
(Recall that the matrix kG is obtained from G by multiplying every 
entry of G by the number k.) 

Proof. Let \(p^0\) be an optimal strategy for R in the game G, that is, 
\(p^0 G \ge V\). Then we have 

\[ p^0 (kG) = k(p^0 G) \ge kV. \]

Similarly, if \(q^0\) is optimal for C in the game G, then 

\[ (kG) q^0 = k(G q^0) \le kV'. \]

These two inequalities show that kv is the value of kG and also that 
optimal strategies in G are also optimal in the game kG. 

It should be observed that it was essential for the proof of this 
theorem that k be nonnegative, since multiplying an inequality by a 
negative number has the effect of reversing the direction of the 
inequality sign. The following example shows that the above theorem 
is false for negative k’s. 

Example 1. Let k = −1 and let G and (−1)G be the matrices 


2 


-1 

0 


-2 

CO 

1 

1 

0 


Observe that each of these games is strictly determined but that the 
value of the first game is 2, while the value of the second is 0 (which 
is not equal to (−1)2 = −2). Moreover, optimal strategies in G are 
for R to play the first row with probability 1, and for C to play the 
first column with probability 1, but neither of these strategies is 
optimal in the game (−1)G. 

Theorem. Let G be an m × n matrix game with value v; let E 
be the m × n matrix each of whose entries is 1; and let k be any 
constant. Then the game G + kE has value v + k, and every strategy 
optimal in the game G is also optimal in the game G + kE. (The 
game G + kE is obtained from the game G by adding the number k 
to each entry in G.) 

Proof. Let \(p^0\) and \(q^0\) be optimal strategies in G; then \(p^0 G \ge V\) and 
\(G q^0 \le V'\). We have 

\[
\begin{aligned}
p^0 (G + kE) &= p^0 G + p^0 (kE) \\
 &= p^0 G + k(p^0 E) \\
 &\ge (v, v, \ldots, v) + (k, k, \ldots, k) \\
 &= (v + k, v + k, \ldots, v + k).
\end{aligned}
\]

Similarly, we have 

\[
(G + kE) q^0 = G q^0 + k(E q^0)
\le \begin{pmatrix} v \\ v \\ \vdots \\ v \end{pmatrix}
 + \begin{pmatrix} k \\ k \\ \vdots \\ k \end{pmatrix}
 = \begin{pmatrix} v + k \\ v + k \\ \vdots \\ v + k \end{pmatrix}.
\]

These inequalities show that the value of the game G + kE is v + k 
and also show that each strategy optimal in G is optimal in G + kE. 
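
Both theorems of this section can be checked numerically on a small game. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and it relies on the observation, which follows from the definition of value, that if the smallest component of pG and the largest component of Gq coincide, then their common value is the value of the game and p and q are optimal.

```python
from fractions import Fraction

def row_guarantee(p, G):
    """Smallest component of pG: what R is assured of when playing p."""
    return min(sum(p[i] * G[i][j] for i in range(len(G)))
               for j in range(len(G[0])))

def col_guarantee(G, q):
    """Largest component of Gq: the most R can obtain when C plays q."""
    return max(sum(G[i][j] * q[j] for j in range(len(q)))
               for i in range(len(G)))

def scale(G, k):
    """The matrix kG."""
    return [[k * x for x in row] for row in G]

def shift(G, k):
    """The matrix G + kE."""
    return [[x + k for x in row] for row in G]

half = Fraction(1, 2)
G = [[2, 0], [0, 2]]          # value 1, optimal strategies (1/2, 1/2)
p = q = [half, half]

for H in (G, scale(G, 3), shift(G, -2)):
    print(row_guarantee(p, H), col_guarantee(H, q))
```

The three lines printed are `1 1`, `3 3`, and `-1 -1`: the same strategies remain optimal, and the value becomes kv = 3 under multiplication by k = 3 and v + k = −1 under the addition of k = −2.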

Matrix game theory would not be of very great interest unless we 
knew under what conditions such a game has a solution. The fundamental 
theorem of game theory is that every matrix game has a solution. 
The proof of this theorem is too difficult to be included here, 
but we do discuss its proof for the 2 × 2 case. 

Fundamental theorem. Let G be any m × n matrix game; then 
there exists a value v for G and optimal strategies \(p^0\) for player R and 
\(q^0\) for player C. In other words, every matrix game possesses a 
solution. 

Proof for 2 × 2 matrices. If G is strictly determined, the value and 
optimal strategies were found in Section 4. If G is not strictly determined, 
formulas (3) through (7) of Section 5 give the optimal strategies 
and value for G. Since G must be either strictly determined or 
nonstrictly determined we have covered all cases. 

EXERCISES 

1. Find the values of the games kG and G + kE for each of the games G 
whose matrices are given in Exercise 1 of Section 6, if k takes on the values 
3, 0, and —2. 

2. If G is any matrix game and k = 0 find all optimal strategies for each 
player in the game kG. [Ans. Any strategy is optimal.] 

3. If G is any matrix game and k > 0, show that every strategy optimal 
in kG is also optimal in G. [Hint: Multiply by 1/k.] 

4. If G is any matrix game and k is any constant, show that every strategy 
optimal in the game G + kE is also optimal in the game G . 

5. Suppose that before C and R play a matrix game G, player C gives 
to player R a payment of k dollars. In this case we shall say C has made a 
side payment of k to R. (If k is negative, then, as usual, this will be a side 
payment of R to C.) 

(a) If C has made a side payment of k to R before playing the game G, 
show that the game they actually play is G + kE. 

(b) If v is the value of the game G , find the value of the game G — vE. 

(c) Using the results of (a) and (b), show that any matrix game G 
with value v can be made into a fair matrix game by requiring 
that C make a side payment of — v to R before they play the 
game G. 

6. Show that any matrix game G can be made into a fair matrix game, 
with each entry in the matrix lying between —1 and 1 , by adding the same 
number to each entry in the matrix and by multiplying each entry by a 
positive number. 

7. Show that the sets of optimal strategies for each player are unchanged 
by the transformation suggested in Exercise 6. How does the value of the 
game change? 

8. Consider the matrix game: 

\[ \begin{pmatrix} a & b & b \\ b & a & b \\ b & b & a \end{pmatrix} \]

where a > b. 

(a) Show that this can be obtained from the identity matrix by multiplying 
it by a suitable number, and then adding bE. 

(b) Use the results of Section 6, Exercise 4, to solve the game. 

[Ans. v = a/3 + 2b/3.] 


9. Suppose that the entries of a matrix game are rewritten in new units 
(e.g., dollars instead of cents). Show that the monetary value of the game 
has not changed. 

10. Consider the game of matching pennies whose matrix is

    (  1  -1 )
    ( -1   1 )


If the entries of the matrix represent gains or losses of one penny, would you 
be willing to play the game at least once? If the entries represent gains or 
losses of one dollar would you be willing to play the game at least once? 
If they represent gains or losses of one million dollars would you play the 
game at least once? In each of these cases show that the value is zero and 
optimal strategies are the same. Discuss the practical application of the 
theory of games in the light of this example. 


8. 2 × n AND m × 2 MATRIX GAMES

After the 2 × 2 games, the simplest matrix games are the 2 × n
and m × 2 games, i.e., games in which one player has only two strategies.
Here we discuss the solution of such games.

Example 1. Suppose that Jones conceals one of the following four
bills in his hand: a $1 or a $2 United States bill or a $1 or a $2 Canadian
bill. Smith guesses either “United States” or “Canadian” and
gets the bill if his guess is correct. The matrix of the game is the
following:

                          Smith Guesses
                          U.S.     Can.
    Jones      U.S. $1     -1        0
    Chooses    U.S. $2     -2        0
               Can. $1      0       -1
               Can. $2      0       -2

It is obvious that Jones should always choose the $1 bill of either 
country rather than the $2 bill, since by doing so he may cut his losses 
and will never increase them. This can be observed in the matrix 
above, since every entry in the second row is less than or equal to the 
corresponding entry in the first row, and every entry in the fourth 
row is less than or equal to the corresponding entry in the third row. 
In effect we can eliminate the second and fourth rows and reduce the 
game to the following 2X2 matrix game: 


                          Smith Guesses
                          U.S.     Can.
    Jones      U.S. $1     -1        0
    Chooses    Can. $1      0       -1

The new matrix game is nonstrictly determined, with optimal strategies
(1/2, 1/2) for Jones and the column vector (1/2, 1/2) for Smith. The value
of the game is -1/2, which means that Smith should pay 50 cents to play it.


Definition. Let A be an m × n matrix game. We shall say that
row i majorizes row h if every entry in row i is as large as or larger than
the corresponding entry in row h. Similarly, we shall say that column
j minorizes column k if every entry in column j is as small as or smaller
than the corresponding entry in column k.

Any majorized row or minorized column can be omitted from the
matrix game without affecting its solution. In the original matrix of
Example 1 above, we see that row 1 majorizes row 2, and also that
row 3 majorizes row 4.
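
The test for majorized rows and minorized columns is mechanical, so it can be sketched in a few lines of Python. This is only a minimal sketch and not part of the original text: the function name reduce_by_dominance and the list-of-rows representation are our own assumptions, and payoffs are taken to be payments to the row player, as in Example 1.

```python
def reduce_by_dominance(matrix):
    """Repeatedly drop majorized rows and minorized columns.

    Payoffs are to the row player R. Returns the reduced matrix together
    with the (zero-based) indices of the surviving rows and columns.
    """
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    changed = True
    while changed:
        changed = False
        # Row h is majorized if some other row i is >= it in every column.
        for h in rows:
            if any(i != h and all(matrix[i][j] >= matrix[h][j] for j in cols)
                   for i in rows):
                rows.remove(h)
                changed = True
                break
        if changed:
            continue
        # Column k is minorized if some other column j is <= it in every row.
        for k in cols:
            if any(j != k and all(matrix[i][j] <= matrix[i][k] for i in rows)
                   for j in cols):
                cols.remove(k)
                changed = True
                break
    reduced = [[matrix[i][j] for j in cols] for i in rows]
    return reduced, rows, cols


# Example 1: rows are U.S. $1, U.S. $2, Can. $1, Can. $2.
example_1 = [[-1, 0], [-2, 0], [0, -1], [0, -2]]
reduced, kept_rows, kept_cols = reduce_by_dominance(example_1)
print(reduced, kept_rows, kept_cols)
```

For Example 1 the sketch keeps the first and third rows, which is the reduction carried out by hand above.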

Example 2. Consider again the card game of Section 4, this time
giving R a bk 5 and a rd 3, while C receives a bk 6, a bk 5, a
rd 4, and a rd 5. The matrix of the game is

                bk 6   bk 5   rd 4   rd 5
        bk 5      1      0     -1      0
        rd 3     -3     -2      1      2


Observe that column 3 minorizes column 4; that is, C should never
play rd 5. Thus our game can be reduced to the following 2 × 3 game:

                bk 6   bk 5   rd 4
        bk 5      1      0     -1
        rd 3     -3     -2      1


No further rows or columns can be omitted; hence we must introduce 
a new technique for the solution of this game. It can be shown 
(though we will not attempt to do so) that, in order to solve a 2 X n 
or m X 2 game, it is sufficient to look at a number of 2 X 2 games. 
These are obtained by striking out columns (or rows) of the original 
game, till it is reduced to a 2 X 2 game; these are derived games of the 
original game. It can then be shown that the optimal strategy of each 
player is optimal in one of the derived games, and that the value of 
the game is the value of one of the derived games. Hence we need 
only solve all the 2 X 2 derived games, and try out each strategy of 
each player, and each value, to see which are the optimal strategies 
and the value of the whole game. 

In the above 2 × 3 game we have three derived games:

           bk 6  bk 5              bk 6  rd 4              bk 5  rd 4
    bk 5     1     0        bk 5     1    -1        bk 5     0    -1
    rd 3    -3    -2        rd 3    -3     1        rd 3    -2     1

The first game is strictly determined, the others are not. The optimal
strategies of player R in each of these derived games are (1, 0),
(2/3, 1/3), and (3/4, 1/4), respectively. The optimal strategies for C are
the column vectors (0, 1), (1/3, 2/3), and (1/2, 1/2), respectively. The
values of the games are 0, -1/3, and -1/2. If we apply
the three R-strategies to the original game we note that with the first
strategy R may lose as much as 1, with the second he may lose 2/3,
while with the third he cannot lose more than 1/2. Hence (3/4, 1/4) is
optimal for him, and -1/2 is the value of the game. In order to apply the
C-strategies from a 2 × 2 derived game to the whole game, we must
first extend them to four-component strategies. They then become the
column vectors

    (0, 1, 0, 0),    (1/3, 0, 2/3, 0),    (0, 1/2, 1/2, 0).

It is easy to verify that the last one is optimal for C.
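
The procedure of solving every 2 × 2 derived game and trying out the resulting strategies in the whole game can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' own method: the names solve_2x2 and solve_2xn are hypothetical, exact fractions are used to avoid rounding, and the nonstrictly determined case uses the usual 2 × 2 formulas (assumed to be those of Section 6). A derived game is accepted when R's guaranteed minimum and C's guaranteed maximum in the whole game agree.

```python
from fractions import Fraction
from itertools import combinations


def solve_2x2(a, b, c, d):
    """Solve the 2 x 2 game with rows (a, b) and (c, d); payoffs are to R."""
    rows = [[a, b], [c, d]]
    for i in range(2):
        for j in range(2):
            entry = rows[i][j]
            # A saddle value is the minimum of its row and maximum of its column.
            if entry == min(rows[i]) and entry == max(rows[0][j], rows[1][j]):
                p = [Fraction(1 - i), Fraction(i)]
                q = [Fraction(1 - j), Fraction(j)]
                return p, q, Fraction(entry)
    # No saddle value: the usual formulas for a nonstrictly determined game.
    s = Fraction(a + d - b - c)
    p = [(d - c) / s, (a - b) / s]
    q = [(d - b) / s, (a - c) / s]
    return p, q, (a * d - b * c) / s


def solve_2xn(matrix):
    """Try each 2 x 2 derived game; return (value, p, q) for the whole game.

    Returns None if no derived game passes the check, which this simple
    sketch does not rule out for degenerate games.
    """
    n = len(matrix[0])
    for j, k in combinations(range(n), 2):
        p, q2, _ = solve_2x2(matrix[0][j], matrix[0][k], matrix[1][j], matrix[1][k])
        q = [Fraction(0)] * n          # extend C's strategy with zeros
        q[j], q[k] = q2
        r_floor = min(p[0] * matrix[0][c] + p[1] * matrix[1][c] for c in range(n))
        c_ceiling = max(sum(matrix[i][c] * q[c] for c in range(n)) for i in range(2))
        if r_floor == c_ceiling:       # both strategies are optimal in the whole game
            return r_floor, p, q
    return None


# Example 2, columns bk 6, bk 5, rd 4, rd 5.
example_2 = [[1, 0, -1, 0], [-3, -2, 1, 2]]
value, p, q = solve_2xn(example_2)
print(value, p, q)
```

Applied to the matrix of Example 2, the sketch accepts the derived game on columns bk 5 and rd 4.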


The solution of any other 2 × n or m × 2 matrix game should be
carried out in a similar manner. First check to see whether or not
the game is strictly determined. If it is not strictly determined,
eliminate all majorized rows and minorized columns. Then solve all
possible 2 × 2 derived games obtained by striking out one or more rows
or columns. The value of the original game will be found as a value
of one of these 2 × 2 games, and the optimal strategies of the original
game will be found among the optimal strategies of the derived games
(which may have to be extended by the addition of zeros).


Example 3. A numerical example of a 3 × 2 game is

    ( 6  -1 )
    ( 0   2 )
    ( 4   3 )

Here the game is strictly determined, since the entry 3 is the minimum
of its row and the maximum of its column. The value of the game is 3,
and optimal strategies are p° = (0, 0, 1) and q° = (0, 1), the latter
written as a column vector.
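
The first step of the procedure, checking whether a game is strictly determined, amounts to looking for an entry that is the minimum of its row and the maximum of its column. A minimal Python sketch of that check follows; the function name saddle_points is our own, and rows and columns are numbered from zero in the output.

```python
def saddle_points(matrix):
    """Return (row, column, value) for every saddle value of the matrix."""
    found = []
    for i, row in enumerate(matrix):
        for j, entry in enumerate(row):
            column = [r[j] for r in matrix]
            if entry == min(row) and entry == max(column):
                found.append((i, j, entry))
    return found


example_3 = [[6, -1], [0, 2], [4, 3]]
print(saddle_points(example_3))
```

For Example 3 the only saddle value is the entry 3 in the third row and second column; an empty list means the game is not strictly determined.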


Example 4. Another numerical example is

    (  1  -1   2  -3 )
    ( -1   1   0   1 )

Here the fourth column minorizes the second, and the first column
minorizes the third. The game is then reduced to

    (  1  -3 )
    ( -1   1 )
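
As a check on the reduced game, the following worked calculation applies the usual 2 × 2 formulas for a nonstrictly determined game (assumed here to be those of Section 6). The letters a, b, c, d for the entries are introduced only for this sketch.

```latex
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} 1 & -3 \\ -1 & 1 \end{pmatrix},
\qquad a + d - b - c = 6,
\]
\[
p^{\circ} = \left(\tfrac{d-c}{6},\ \tfrac{a-b}{6}\right)
          = \left(\tfrac{1}{3},\ \tfrac{2}{3}\right),
\quad
q^{\circ} = \begin{pmatrix} \tfrac{d-b}{6} \\[2pt] \tfrac{a-c}{6} \end{pmatrix}
          = \begin{pmatrix} \tfrac{2}{3} \\[2pt] \tfrac{1}{3} \end{pmatrix},
\quad
v = \frac{ad - bc}{6} = \frac{1 - 3}{6} = -\tfrac{1}{3}.
\]
```

Extended to the four columns of the original game by the addition of zeros, C's strategy becomes the column vector (2/3, 0, 0, 1/3).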



EXERCISES 

1. Solve the following games: 









0 

15 

8 

0 

-10 

20 

10 

12

(c)


-1 

5 

-1 

-2 

8 

10 

3 

—6 

0 

8 

-9 

-8 


[An ans. v = — i; (i,i) ; 


1 1 

T2- 

0 

0 

\0 / 


.] 


3. Solve the game

    ( 1  2  3 )
    ( 3  2  1 )

Since there is more than one optimal strategy for C, find a range of optimal
strategies for him. (See Section 6, Exercise 3.)

4. In the card game of Example 2 suppose that R has bk 9, bk 5, rd 7, and
rd 3, while C has bk 8 and rd 4. Set up and solve the corresponding matrix
game.

[Ans. v = 1; R shows bk 5 and rd 7 each with probability 1/2; C shows
each of his cards with probability 1/2.]

5. Suppose that Jones conceals in his hand one, two, three, or four silver 
dollars and Smith guesses “even” or “odd.” If Smith’s guess is correct, he 
wins the amount which Jones holds, otherwise he must pay Jones this amount. 
Set up the corresponding matrix game and find an optimal strategy for each 
player in which he puts positive weight on all his (pure) strategies. Is the 
game fair? 

6. Consider the following game: Player R announces “one” or “two”; 
then, independently of each other, both players write down one of these two 
numbers. If the sum of the three numbers so obtained is odd, C pays R the 
odd sum in dollars; if the sum of the three numbers is even, R pays C the 
even sum in dollars. 

(a) What are the strategies of R? (Hint: He has four strategies.)

(b) What are the strategies of C? (Hint: We must consider what C
does after “one” is announced and after “two” is announced. Hence he
has four strategies.)

(c) Write down the matrix for the game.

(d) Restrict player R to announcing “two,” and allow for C only
those strategies where his number does not depend on the announced
number. Solve the resulting 2 × 2 game.

(e) Extend the above mixed strategies to the original game, and show 
that they are optimal. 

(f) Is the game favorable to R? If so, by how much? 

7. Answer the same questions as in Exercise 6 if R gets the even sum and 
C gets the odd sum (except that, in part (d) restrict R to announce “one”). 
Which game is more favorable for R? Could you have predicted this without 
the use of game theory? 

8. Rework the five-finger Morra game of Section 6, Exercise 7, with the 
following payoffs: If the sum of the number of fingers is even, R gets one, 
while if the sum is odd, C gets one. Suppose that each player shows only one 
or two fingers. Show that the resulting game is like matching pennies. Show 
that the optimal strategies for this game, when extended, are optimal in the 
whole game. 

9. A version of three-finger Morra is played as follows: Each player shows 
from one to three fingers; R always pays C an amount equal to the number 
of fingers that C shows; if C shows exactly one more or two fewer fingers 
than R, then C pays R a positive amount x (where x is independent of the 
number of fingers shown). 

(a) Set up the game matrix for arbitrary x’s. 

(b) If x = J, show that the game is strictly determined. Find the 

value. [Arts, v = — -§-.] 

(c) If x = 2, show that there is a pair of optimal strategies in which
the first player shows one or two fingers and the second player
shows two or three fingers. (Hint: Solve a 2 × 2 derived game.)
Find the value. [Ans. v = -3/2.]

(d) If x = 6, show that an optimal strategy for R is to use the mixed
strategy (1/3, 1/2, 1/6). Show that the optimal mixed strategy for C is to
choose his three strategies each with probability 1/3. Find the value
of the game.

10. Another version of three-finger Morra goes as follows: Each player
shows from one to three fingers; if the sum of the number of fingers is even, 
then R gets an amount equal to the number of fingers that C shows: if the 
sum is odd, C gets an amount equal to the number of fingers that R shows. 

(a) Set up the game matrix. 

(b) Reduce the game to a 2 X 2 matrix game. 

(c) Find optimal strategies for each player and show that the game is
fair.

11. Two companies, one large and one small, manufacturing the same
product, wish to build a new store in one of four towns located on a given
highway. If we regard the total population of the four towns as 100 per cent,
the distribution of population and distances between towns are as shown:

[Figure: towns 1, 2, 3, and 4 along the highway, with their populations and the distances between them.]


Assume that if the large company’s store is nearer a town it will capture 80 
per cent of the business, if both stores are equally distant, then the large 
company will capture 60 per cent of the business, and if the small store is 
nearer, then the large company will capture 40 per cent of the business. 

(a) Set up the matrix of the game. 

(b) Test for majorized rows and minorized columns. 

(c) Find optimal strategies and value for the game and interpret your 
results. 

[Ans. Both companies should locate in town 2; the large company
captures 60 per cent of the business.]

12. Rework Exercise 11 if the per cent of business captured by the large 
company is 90, 75, and 60, respectively. 

13. We have stated without proof that any 2 × n game can be solved by
considering only its 2 × 2 derived games. Verify that this is the case for a
game of the form

    ( a  0  1 )
    ( 0  b  1 )

(a) Show that if a < 1 or b < 1, then column 3 is minorized. Hence
solve the game.

(b) If a > 1 and b > 1, solve the three 2 × 2 derived games. (Hint:
Two of them are strictly determined.)

(c) If a > 1, b > 1, but ab < a + b, then show that the strategies of
the nonstrictly determined derived game are optimal for both
players.

(d) If ab > a + b, then show that R has as optimal strategy the same
strategy as in part (c), but C has a pure strategy as optimal
strategy.

(e) Using the previous results, show that the value of the game is
always the smallest of the values of the three derived games.

9. SIMPLIFIED POKER 

In order to illustrate the procedure of translating a game specified
by rules into a matrix game, we shall carry it out for a simplification
of a well-known game. The example that we are about to discuss is a
simplification (by A. W. Tucker) of the poker game discussed on pp.
211-219 in the book The Theory of Games and Economic Behavior, by
John von Neumann and Oskar Morgenstern.

The deck that is used in simplified poker has only two types of
cards, in equal numbers, which we shall call “high” and “low.” For
example, an ordinary bridge deck could be used with red cards high
and black cards low. Each player “antes” an amount a of money and
is dealt a single card which is his “hand.” By a “deal” we shall mean
a pair of cards, the first being given to player R and the second to
player C. Thus the deal (H,H) means that each player obtains a
high card. There are then four possible deals, namely,

(H,H), (H,L), (L,H), (L,L).

Ignoring minor errors (see Exercise 1), if the number of cards in the
deck is large, each of these deals is “equally likely,” that is, the
probability of getting a specific one of these deals is 1/4.

After the deal, player R has the first move and has two alternatives, 
namely, to “see,” or to “raise” by adding an amount b to the pot. 
If R elects to see, the higher hand wins the pot or equal hands split 
the pot equally. If R elects to raise, then C has two alternatives, to 
“fold,” or to “call” by adding the amount b to the pot. If C folds, 
player R wins the pot (without revealing his hand). If C calls, then 
the higher hand wins the pot or equal hands split the pot equally. 
These are all the rules. 

A pure strategy for a player is a command that tells him exactly
what to do in every conceivable situation that can arise in the game.
An example of a pure strategy for R is the following: “Raise if you
get a high card, and see if you get a low card.” We can abbreviate
this strategy to simply raise-see. It is easy to see that R has four
pure strategies, namely, raise-raise, raise-see, see-raise, and see-see.
In the same manner, C has four pure strategies, fold-fold, fold-call,
call-fold, call-call.

Given a choice of a pure strategy for each player, there are exactly
four ways the play of the game can proceed, depending on which
of the four deals occurs. For example, suppose that R has chosen
the see-raise strategy, and C has chosen the fold-fold strategy. If the
deal is (H,H), then R sees, and they split the pot, so neither wins;
if the deal is (H,L), then R sees and wins the pot, giving him a; if the
deal is (L,H), then R raises and C folds, so that R wins a; and if the
deal is (L,L), then R raises and C folds, so that R wins a. Since the
probability of each of these deals is 1/4, the expected value of R’s gain
is 3a/4. Let us compute another expected value, namely, suppose
that R uses see-raise and C uses call-fold. Then, if the deal is (H,H),
R sees and wins nothing; if the deal is (H,L), then R sees and wins a;
if the deal is (L,H), then R raises, C calls, and C wins a + b; and if
the deal is (L,L), then R raises, C folds, and R wins a. The expected
value for R here is (a - b)/4.

Continuing in this manner we can compute the expected outcome
for each of the 16 possible pairs of strategies. The payoff
matrix so obtained is given below.


               High      fold        fold         call        call
               Low       fold        call         fold        call
    High   Low
    see    see             0           0            0           0
    see    raise         3a/4        2a/4       (a - b)/4     -b/4
    raise  see            a/4      (a + b)/4        0          b/4
    raise  raise         4a/4     (3a + b)/4    (a - b)/4       0


The reader should observe that we have just completed the
translation of a game specified by rules into a matrix game.
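
The translation from rules to matrix can also be carried out mechanically. The following Python sketch enumerates the four equally likely deals for every pair of pure strategies and averages R's gain. It is an illustration under our own assumptions, not part of the original text: the function names showdown, play, and payoff_matrix are hypothetical, a strategy is written as the pair (action on a high card, action on a low card), and a and b are taken to be whole numbers.

```python
from fractions import Fraction
from itertools import product

R_STRATEGIES = [("see", "see"), ("see", "raise"), ("raise", "see"), ("raise", "raise")]
C_STRATEGIES = [("fold", "fold"), ("fold", "call"), ("call", "fold"), ("call", "call")]


def showdown(r_card, c_card, stake):
    """R's gain when the hands are compared and each player has put in `stake`."""
    if r_card == c_card:
        return 0                      # equal hands split the pot
    return stake if r_card == "H" else -stake


def play(r_strategy, c_strategy, r_card, c_card, a, b):
    """R's gain for one deal; a strategy is (action on high, action on low)."""
    r_action = r_strategy[0] if r_card == "H" else r_strategy[1]
    if r_action == "see":
        return showdown(r_card, c_card, a)
    c_action = c_strategy[0] if c_card == "H" else c_strategy[1]
    if c_action == "fold":
        return a                      # R takes the pot, winning C's ante
    return showdown(r_card, c_card, a + b)


def payoff_matrix(a, b):
    """Expected gain to R for each pair of pure strategies (deals equally likely)."""
    deals = list(product("HL", repeat=2))
    return [[sum(Fraction(play(r, c, rc, cc, a, b), 4) for rc, cc in deals)
             for c in C_STRATEGIES]
            for r in R_STRATEGIES]


for row in payoff_matrix(4, 8):
    print([str(x) for x in row])
```

With a = 4 and b = 8 the see-raise row comes out as 3, 2, -1, -2, that is, 3a/4, 2a/4, (a - b)/4, and -b/4, in agreement with the two entries computed by hand.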

Since a and b are positive numbers, we see that, in the matrix above,
the fourth row majorizes the second, and the third row majorizes the
first. Similarly, the third column minorizes the first and second
columns. We can reduce the 4 × 4 matrix to the following 2 × 2 matrix:

                                      Conservative    Bluffing
                               High       call          call
                               Low        fold          call
                  High    Low
    Conservative  raise   see              0             b/4
    Bluffing      raise   raise        (a - b)/4          0


Notice that we have labeled the raise-see strategy as “conservative”
for R, since it seems sensible to raise when he has a high card and to
see when he has a low one. The strategy raise-raise which says, raise
even if you have a low card, we have labeled “bluffing,” since it
corresponds to the ordinary notion of bluffing. In the same manner we
have labeled the call-fold strategy “conservative,” and the call-call
strategy “bluffing,” for player C.

Example 1. Suppose a = 4 and b = 8. Then the matrix becomes

                    Conservative    Bluffing
    Conservative         0             2
    Bluffing            -1             0

Here the game is strictly determined and fair, and optimal strategies
are for each player to play conservatively.

Example 2. Suppose a = 8 and b = 4. Then the matrix becomes

                    Conservative    Bluffing
    Conservative         0             1
    Bluffing             1             0

Here the value of the game is 1/2, meaning that it is biased in favor of
R. Optimal strategies are for each player to bluff with probability 1/2
and to play conservatively with probability 1/2.

Here we have one of the most interesting results of game theory,
since it turns out that, as part of an optimal strategy, one should
actually bluff part of the time.
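
To see where the bluffing probabilities come from, the following sketch solves the reduced 2 × 2 game for general a > b > 0 by requiring each player's mixture to equalize the opponent's two payoffs. The symbols p_1, p_2 (R's probabilities of playing conservatively and of bluffing) and q_1, q_2 (the same for C) are introduced here only for illustration.

```latex
\[
\begin{aligned}
p_2\,\frac{a-b}{4} &= p_1\,\frac{b}{4}, \quad p_1 + p_2 = 1
  &&\Longrightarrow\quad p_1 = \frac{a-b}{a},\quad p_2 = \frac{b}{a},\\[4pt]
q_2\,\frac{b}{4} &= q_1\,\frac{a-b}{4}, \quad q_1 + q_2 = 1
  &&\Longrightarrow\quad q_1 = \frac{b}{a},\quad q_2 = \frac{a-b}{a},\\[4pt]
v &= p_1\,\frac{b}{4} = \frac{b\,(a-b)}{4a}.
\end{aligned}
\]
```

With a = 8 and b = 4 this gives probability 1/2 for each choice and v = 1/2.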

EXERCISES 

1. Suppose that the simplified poker game is played with an ordinary
bridge deck where red is “high” and black is “low.” Compute to four decimal
places the conditional probability of drawing a red card, given that one red
card has already been drawn. From this, discuss the accuracy of the
assumption that the four deals are equally likely. How could the accuracy of
the assumption be improved?

2. Substitute a = 4 and b = 8 into the 4 × 4 matrix above, and reduce
it by majorizations and minorizations to a 2 × 2 matrix game. Is it the one
considered in Example 1 above? Do the same for a = 8 and b = 4 and
Example 2.

3. If a < b, show that the simplified poker game is strictly determined
and fair. Show that both players’ optimal strategy is to play conservatively.

4. If a > b, show that the simplified poker game is biased in favor of R.
Show that, to play optimally, each player must bluff with positive probability,
and find the optimal strategies.

5. If a > b, discuss ways of making the game fair.

6. When b > a, show that the optimal strategy of player R is not unique.
Show that although he has two “optimal” strategies, the raise-see strategy
is in a sense better than the other.

7. Show that in the case a = 8, b = 4, the strategy of R can be interpreted
as follows: “On a high card always raise, on a low card raise with probability
1/2.” Reinterpret C’s mixed strategy similarly.

The remaining exercises concern a variant of the simplified poker game.
Real poker is characterized by the fact that there are very many poor hands,
and very few good ones. We can make the above model of poker more realistic
by making the draw of a low card more probable than that of a high card.
Let us say that the probability of drawing a high card is only 1/5. The rules
of the game remain as in the text.

8. Calculate the probabilities of (H,H), (H,L), (L,H), and (L,L) deals. 

9. The strategies of the two players are as in the text, hence we will get a
similar 4 × 4 game matrix. Calculate the see-raise vs. fold-fold entry of the
matrix, just as in the text, but using the results of Exercise 8. Do the same
for the see-raise vs. call-fold entry. [Ans. 24a/25; (16a - 4b)/25.]

10. Fill in the remaining matrix entries. 

11. Show that two rows are majorized, and that two columns are 
minorized. 

12. Show that the resulting 2 × 2 game is strictly determined if and only
if b > 4a. What is the value of the game in these cases?

13. Let a = 4, b = 8, as in the text, and solve the game. Compare your
solution with that in the text.

[Ans. Each player should bluff half the time; v = 16/25; in the previous
version there was no bluffing in this case, and the game was fair.]

14. Let a = 8, b = 4, as in the text, and solve the game. Compare your
solution with that in the text.

[Ans. Each player plays more conservatively; game is slightly more
favorable to R than in the previous version.]

15. The players have agreed that the ante will be $4. They are debating
the size of the raise. What value of b should player R argue for? (Hint: He
does not want the game to be fair. Then what are the possible values of b?
Find the value of the 2 × 2 game for any such b, and find its maximum value
by trying several values of b.)


Chapter VII *


APPLICATIONS TO 
BEHAVIORAL SCIENCE 
PROBLEMS 


1. SOCIOMETRIC MATRICES 

Matrices having only the entries 0 and 1 have been used by
sociologists to analyze the structure of dominance relations in groups
of subjects (animal or human). We shall use the notation A_1 ≫ A_2 to
indicate that individual A_1 “dominates” individual A_2. For example,
in the pecking order of chickens in a barnyard, A_1 dominates A_2, that
is, A_1 ≫ A_2 means “chicken A_1 pecks chicken A_2.” As another
example, suppose that A_1 and A_2 are athletic teams and the relation
A_1 ≫ A_2 means “team A_1 beats team A_2.”

We shall say that the relation ≫ is a dominance relation if it satisfies
the following two properties:

(i) It is false that A_i ≫ A_i; that is, no individual can dominate
himself.

(ii) For each pair of individuals A_1 and A_2 either A_1 ≫ A_2 or
A_2 ≫ A_1, but not both; that is, in every pair of individuals,
there is exactly one who is dominant.

It has been observed that in the pecking order of chickens a
dominance relation holds. Also, in the play of one round of a round robin
contest among athletic teams, if ties are not allowed (as in baseball),
then a dominance relation holds.

The reader may have been surprised that we did not assume that,
if A_1 ≫ A_2 and A_2 ≫ A_3, then A_1 ≫ A_3. This is the so-called transitive
law. A moment's reflection shows that the transitive law need not
hold for dominance relations. Thus if team A beats team B and team
B beats team C (in football, say), then we cannot assume that team
A will necessarily beat team C. In almost every football season there
are examples where “upsets” occur.

A convenient way of depicting dominance relations is by means of
directed graphs. Two such directed graphs are shown in Figure 1.

[Figure 1: two directed graphs, (a) and (b), each on the points A_1, A_2, A_3.]

Individuals are represented on the graph as (lettered) points and a
dominance relation between two individuals as a directed line segment
(line segment with an arrow) connecting the two individuals. The
graph in Figure 1(a) represents the situation: A_1 dominates A_2, also
A_2 dominates A_3, and A_3 dominates A_1. Similarly, the graph in Figure
1(b) represents the situation: A_1 dominates A_2 and A_3, and A_2
dominates A_3. These graphs represent the two essentially different
dominance relationships that are possible among three individuals (cf.
Exercise 1).

Still another way in which dominance relations can be exhibited
is by means of matrices, called dominance matrices, having only zeros
and ones as entries. Two such matrices are shown in Figure 2.

              A_1  A_2  A_3                 A_1  A_2  A_3
      A_1   (  0    1    0  )       A_1   (  0    1    1  )
      A_2   (  0    0    1  )       A_2   (  0    0    1  )
      A_3   (  1    0    0  )       A_3   (  0    0    0  )

                 (a)                           (b)

                           Figure 2

Notice that we have labeled both the rows and the columns with
letters of the individuals. An entry of 1 in the row of individual A_i
and the column of individual A_j means that individual A_i dominates
A_j; that is, A_i ≫ A_j. Similarly, a 0 entry there means that A_i does
not dominate A_j. The reader can check that the dominance situations
in Figures 2(a) and (b) are the same as those in Figures 1(a) and (b),
respectively.

Since a dominance matrix is derived from a dominance relation,
we can investigate the effects of conditions (i) and (ii) above on the
entries in the matrix. Condition (i) simply means that all entries on
the main diagonal (the one which slants downward to the right) of
the matrix must be zero. Condition (ii) means that, whenever an
entry above the main diagonal of the matrix is 1, the corresponding
entry of the matrix which is placed symmetrically to it through the
main diagonal is 0, and vice versa. To state these conditions more
precisely, suppose that there are n individuals, and let D be a
dominance matrix with entries d_ij. Then the conditions above are

(i) d_ii = 0 for i = 1, 2, . . ., n.

(ii) If i ≠ j, then d_ij = 1 if and only if d_ji = 0.

The 1 entries in the ith row correspond to the individuals whom A_i
dominates, and the 1 entries in the jth column correspond to the
individuals who dominate A_j.
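
Conditions (i) and (ii) are easy to test mechanically. The following Python sketch checks them for a matrix given as a list of rows; the function name is_dominance_matrix is our own, and the two test matrices are those of Figure 2 as described in the text.

```python
def is_dominance_matrix(d):
    """Check conditions (i) and (ii) for a square 0-1 matrix given as lists."""
    n = len(d)
    if any(len(row) != n for row in d):
        return False
    for i in range(n):
        if d[i][i] != 0:                      # condition (i)
            return False
        for j in range(i + 1, n):
            # condition (ii): exactly one of d_ij and d_ji equals 1
            if sorted((d[i][j], d[j][i])) != [0, 1]:
                return False
    return True


figure_2a = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
figure_2b = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
print(is_dominance_matrix(figure_2a), is_dominance_matrix(figure_2b))
```

Both matrices pass the test, while a matrix with a 1 on the diagonal, or with d_ij = d_ji for some pair, is rejected.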

Since a dominance matrix D is square, we can compute the powers
of the matrix, D^2, D^3, etc. Let E = D^2, and consider the entry in the
ith row and jth column of E. We have

    e_ij = d_i1 d_1j + d_i2 d_2j + . . . + d_in d_nj.

Now a term of the form d_ik d_kj can be nonzero only if both factors are
nonzero; that is, only if both factors are equal to 1. But if d_ik = 1,
then individual A_i dominates A_k; and if d_kj = 1, then individual A_k
dominates A_j. In other words, A_i ≫ A_k ≫ A_j. We shall call a
dominance of this kind a two-stage dominance. (To keep ideas straight, let
us call A_i ≫ A_j a one-stage dominance.) We now can see that the entry
e_ij gives the number of two-stage dominances that individual A_i has
over individual A_j.

For example, let D be the matrix

          ( 0  1  1  1 )
    D  =  ( 0  0  1  1 )
          ( 0  0  0  1 )
          ( 0  0  0  0 )

Then D^2 is the matrix

            ( 0  0  1  2 )
    D^2  =  ( 0  0  0  1 )
            ( 0  0  0  0 )
            ( 0  0  0  0 )

Thus we see that in this example A_1 has one two-stage dominance
over A_3 and two two-stage dominances over A_4; similarly, A_2 has one
two-stage dominance over A_4. These can be written down explicitly as

    A_1 ≫ A_2 ≫ A_3,
    A_1 ≫ A_2 ≫ A_4,
    A_1 ≫ A_3 ≫ A_4,
    A_2 ≫ A_3 ≫ A_4.


The directed graph for this dominance situation is given in Figure 3. 



The reader should trace out on the graph of Figure 3 the two-stage 
dominances given above. 
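
The same two-stage dominances can be traced by machine. The Python sketch below squares the matrix and lists every chain A_i ≫ A_k ≫ A_j; the function names are our own, and the matrix D used is the one consistent with the D^2 and the four chains listed above (A_1 dominates everyone, A_2 dominates A_3 and A_4, and A_3 dominates A_4).

```python
def matrix_square(d):
    """Return D^2 for a square matrix given as a list of rows."""
    n = len(d)
    return [[sum(d[i][k] * d[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def two_stage_dominances(d):
    """List every chain (i, k, j) with A_i >> A_k >> A_j, numbering from 1."""
    n = len(d)
    return [(i + 1, k + 1, j + 1)
            for i in range(n) for k in range(n) for j in range(n)
            if d[i][k] == 1 and d[k][j] == 1]


D = [[0, 1, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
print(matrix_square(D))
print(two_stage_dominances(D))
```

Each entry of the square counts the chains that start at the row index and end at the column index, so the number of chains listed always equals the sum of the entries of D^2.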

The following theorem will be proved in the next section.

Theorem. Let ≫ be a dominance relation on a set of n individuals
A_1, A_2, . . ., A_n. Then there exists at least one individual who can
dominate in either one or two stages every other individual in the
group. Also there exists at least one individual who is dominated in
either one or two stages by every other individual in the group.

In matrix language the theorem can be restated as follows: let
S = D + D^2; then there are at least one row and one column of S
having all but one entry (the diagonal one) nonzero.

To illustrate this theorem consider the dominance situation shown
in Figure 4. The dominance matrix D and its square D^2 for this
situation are

          ( 0  1  0  1 )              ( 0  1  2  0 )
    D  =  ( 0  0  1  0 ) ,    D^2  =  ( 1  0  0  0 )
          ( 1  0  0  0 )              ( 0  1  0  1 )
          ( 0  1  1  0 )              ( 1  0  1  0 )

The matrix S corresponding to these is

                        ( 0  2  2  1 )
    S  =  D + D^2  =    ( 1  0  1  0 )
                        ( 1  1  0  1 )
                        ( 1  1  2  0 )

Observe that A_1, A_3, and A_4 can each dominate every other individual
in one or two stages, but that A_2 cannot so dominate A_4. Similarly,
each of the individuals A_1, A_2, and A_3 is dominated in one or two
stages by every other individual, while A_4 is not so dominated by A_2.
It is instructive to check these statements in the directed graph of
Figure 4.

[Figure 4: directed graph of the dominance situation on A_1, A_2, A_3, A_4.]

As a final application of these dominance matrices, we shall define
the power of an individual. By the power of an individual in a
dominance situation, we mean the total number of one-stage and two-stage
dominances which he can exert. Since the total number of one-stage
dominances exerted by A_i is the sum of the entries in row i of the
matrix D, and the total number of two-stage dominances exerted by
A_i is the sum of the entries in row i of the matrix D^2, we see that the
power of A_i can be expressed as:

The power of A_i is the sum of the entries in row i of the
matrix S = D + D^2.

In the example of Figure 4 it is easy to check that the powers of the
various individuals are the following:

The power of A_1 is 5.

The power of A_2 is 2.

The power of A_3 is 3.

The power of A_4 is 4.
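
These powers can be checked with a short computation. The Python sketch below forms S = D + D^2 for the situation of Figure 4 and sums its rows; the function name powers is our own, and the matrix D is the one determined by the statement that A_1 dominates A_2 and A_4, A_2 dominates A_3, A_3 dominates A_1, and A_4 dominates A_2 and A_3.

```python
def powers(d):
    """Return the row sums of S = D + D^2 together with S itself."""
    n = len(d)
    s = [[d[i][j] + sum(d[i][k] * d[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return [sum(row) for row in s], s


# Dominance matrix of Figure 4.
D = [[0, 1, 0, 1],
     [0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 1, 1, 0]]
power, S = powers(D)
print(power)

# Rows of S whose off-diagonal entries are all nonzero: the individuals who
# dominate everyone else in one or two stages.
n = len(D)
dominators = [i + 1 for i in range(n)
              if all(S[i][j] > 0 for j in range(n) if j != i)]
print(dominators)
```

The second list illustrates the theorem: it is never empty for a dominance matrix, and here it contains A_1, A_3, and A_4.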

Example (Athletic contest). The idea of the power of an
individual can be used to judge athletic events. For example, suppose that
a single round of a round robin athletic event results in the
following data.

Team A beats teams B and D.

Team B beats team C.

Team C beats team A.

Team D beats teams C and B.

Then it is easy to check that this is precisely the dominance situation
shown in Figure 4. By the analysis given above we can rate the teams
in the following order according to their respective powers: A, D, C,
and B.

It should be remarked that the above definition of the power of an
individual is not the only one possible. In Exercise 10 below we
suggest another definition of power which gives different results. Before
using one or the other of these definitions, a sociologist should examine
them carefully to see which (if either) fits his needs.


EXERCISES 

1. Show that there are only two essentially different pecking orders 
possible among three chickens, namely, those given in Figure 1. (Hint: Use 
directed graphs.) 

2. Find the dominance matrices D corresponding to the following directed
graphs.

[Figure: directed graphs (a) and (b).]

3. Compute the matrices D^2 and S = D + D^2 and determine the powers
of each of the individuals in the examples of Exercise 2.

[Ans. (b)

            ( 0  1  1  0 )            ( 0  2  1  1 )
    D^2  =  ( 0  0  0  0 ) ,    S  =  ( 0  0  0  0 ) ;    4, 0, 4, 4.]
            ( 0  1  0  1 )            ( 1  2  0  1 )
            ( 1  1  0  0 )            ( 1  2  1  0 )



4. Find the powers of each of the individuals in the dominance situation
whose matrix is

          ( 0  1  0  0  1  1  1 )
          ( 0  0  1  1  1  0  1 )
          ( 1  0  0  1  0  1  0 )
    D  =  ( 1  0  0  0  1  1  1 )
          ( 0  0  1  0  0  0  0 )
          ( 0  1  0  0  1  0  1 )
          ( 0  0  1  0  1  0  0 )

[Ans. The powers are: 14, 14, 14, 14, 4, 10, 6, for A_1 through A_7,
respectively.]

5. Find all the essentially different pecking orders that are possible among
four chickens. [Ans. There are four essentially different ones.]

6. If D is any dominance matrix, give the interpretation of the entries
in the columns of the matrix S = D + D^2. Also give the interpretation for
the column sums of S.

7. If D is any dominance matrix, give the interpretation for the entries
in the matrix D^3; also give the interpretations for the row and column sums
of D^3; do the same for the entries and the row and column sums of the matrix
S = D + D^2 + D^3.

[Ans. The entries in the ith row of D^3 give the three-stage dominances
that A_i exerts; the ith row sum of D^3 gives the total number of
three-stage dominances that A_i exerts. The entries in S give the one-, two-,
or three-stage dominances, and the ith row sum of S gives the total
number of such that A_i exerts.]

8. If D is any dominance matrix, give the interpretation for the entries
in the matrix

    S = D + D^2 + D^3 + . . . + D^n.

Also give the interpretation for the row and column sums of this matrix.

9. A round robin tennis match among four people has produced the 
following results: 

Smith has beaten Brown and Jones. 

Jones has beaten Brown. 

Taylor has beaten Smith, Brown, and Jones. 

By finding the powers of each player, rank them into first, second, third, and 
fourth place. Does this ranking agree with your intuition? 

[Ans. Taylor has power = 6, Smith has power = 3, Jones has power 
= 1, and Brown has power = 0.] 

10. Let the power$_1$ of an individual be the power as defined in the text
above. Define a new power, called power$_2$, of an individual as follows: If $D$
is the dominance matrix for a group of $n$ individuals, then the power$_2$ of $A_i$
is the sum of row $i$ of the matrix
\[ S' = D + \tfrac{1}{2} D^2. \]
Find the power$_2$ of each of the teams in the athletic team example in the text.
Show that the power$_2$ of a team need not equal its power$_1$. Comment on the
result.

11. Find the power$_2$ of the players in Exercise 9. Discuss its relation with
the power$_1$ of each of the players.

[Ans. Taylor has power$_2$ = 9/2, Smith has power$_2$ = 5/2, Jones has power$_2$
= 1, Brown has power$_2$ = 0.]

12. Discuss and give interpretations for the entries of the matrix
\[ S = D + \tfrac{1}{2} D^2 + \cdots + \tfrac{1}{m} D^m. \]
Give interpretations for the row and column sums of this matrix.

13. Use the result of Exercise 5 to show that if a round robin tournament 
of four players is judged by their power, one of two things must happen: 
Either there is no tie, or there is a three-way tie. In the case of three-way 
ties show, by symmetry, that no rational criterion can be used to break these 
ties (without having playoff games, of course). 

14. In the example in the text replace “beats” by “is beaten by.” Does
this reverse the order of the teams according to their power? Do the same
for Exercise 9. [Ans. No; CBAD; Yes.]

2. COMMUNICATION NETWORKS 

A communication network consists of a set of $n$ people, call them
$A_1, A_2, \ldots, A_n$, such that between some pairs of persons there is a
communication link. A communication link may be either one-way
or two-way. A two-way communication link might be made by telephone
or radio, and a one-way link by sending a messenger, lighting
a signal light, setting off an explosion, etc. We shall again use the
symbol $\gg$, where $A_i \gg A_j$ now shall mean that individual $A_i$ can
communicate with $A_j$ (in that direction). The only requirement that
we now put on the symbol $\gg$ is the following:

(i) It is false that $A_i \gg A_i$ for any $i$; that is, an individual cannot
(or need not) communicate with himself.

Notice that we have dropped the second condition which we used in
the preceding section; we do not require that, of every pair of individuals,
at least one can communicate with the other. Also, it is
possible that both $A_i \gg A_j$ and $A_j \gg A_i$; that is, a two-way
communication link is possible.

Again it is convenient to use directed graphs to represent communication
networks. In Figure 5 we have drawn two such. The arrows,
of course, indicate the direction in which communication is possible.
A double arrow indicates communication is possible in both directions.

As before, we can also represent communication networks by means
of matrices $C$ having only 0 and 1 entries, which we call communication
matrices. The entry in the $i$th row and $j$th column of $C$ will be
equal to 1 if $A_i$ can communicate with $A_j$ (in that direction) and
otherwise equal to 0. Thus the communication matrices corresponding
to the communication networks of Figure 5 are given in Figure 6.

[Figure 5: two communication networks, (a) and (b), drawn as directed graphs; not reproduced.]


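\[
\text{(a)}\quad C = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}
\qquad
\text{(b)}\quad C = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}
\]

Figure 6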


Notice that the diagonal entries of the matrices in Figure 6 are all
equal to 0. This is true in general for a communication matrix, since
a restatement of condition (i) in matrix language is that $c_{ii} = 0$ for
all $i$. It is not hard to see that any matrix having only 0 and 1 entries,
and with all zeros down the main diagonal, is the communication
matrix of some network.

The square of a communication matrix has an interpretation similar
to the interpretation of the square of a dominance matrix. If $C$ is a
communication matrix, the entry in the $i$th row and $j$th column of $C^2$
gives the number of two-stage communications between $A_i$ and $A_j$.
For example, the square of the matrix in Figure 6(a) is
\[
C^2 = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{pmatrix}.
\]

The entry of 1 in the upper left-hand corner indicates that $A_1$ can
communicate “with himself” in two stages. This is indeed true, the
communication chain being $A_1 \gg A_2 \gg A_1$. One can also see, for example,
that $A_4$ has two-stage communications with both $A_1$ and $A_3$.
These and the other two-stage communications indicated in $C^2$ can
easily be seen on the graph of Figure 5(a).
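
The two-stage counts can also be checked mechanically. The following is a minimal sketch in Python, not part of the original text; the helper name `mat_mul` is our own choice, and the matrix is the one we read from Figure 6(a).

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Communication matrix of Figure 6(a): C[i][j] == 1 when person i+1
# can communicate with person j+1 in one stage.
C = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 0],
]

# Entry (i, j) of C2 counts the two-stage chains from person i+1 to person j+1.
C2 = mat_mul(C, C)
for row in C2:
    print(row)
```

The first printed row begins with 1, which is the chain from the first person to the second and back again.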

As in Section 1 we shall be interested in the sum of the matrices
$C$ and $C^2$. Let $S = C + C^2$. The following theorem includes the theorem
in Section 1 (see Exercise 7).

Theorem. Let a communication network of n individuals be such 
that, for every pair of individuals, at least one can communicate in 
one stage with the other. Then there is at least one person who can 
communicate with every other person in either one or two stages. 
Similarly, there is at least one person who can be communicated with 
in one or two stages by every other person. 

Stated in matrix language, the above theorem is: Let $C$ be the
communication matrix for the network described above; then there
is at least one row of $S = C + C^2$ which has all its elements nonzero,
except possibly the entry on the main diagonal. Similarly, there is
at least one column having this property.

Proof. We shall prove only the first statement since the proof of 
the second is analogous. 

First we shall prove the following statement: If $A_1$ cannot communicate
in either one or two stages with $A_i$, where $i \neq 1$, then $A_i$
can communicate in one stage with at least one more person than can
$A_1$. We prove this in two steps. First, by the hypothesis of the theorem
we see that:

(a) If it is false that $A_1 \gg A_i$, then $A_i \gg A_1$.

Second, we can prove that:

(b) Suppose that for all $k$ it is false that $A_1 \gg A_k \gg A_i$; it follows
that, if $A_1 \gg A_k$, then also $A_i \gg A_k$.

For if $A_1 \gg A_k$, it is false that $A_k \gg A_i$; hence, by the hypothesis
of the theorem, it is true that $A_i \gg A_k$.

Now (b) says that every one-stage communication possible for $A_1$
is also possible for $A_i$. From this and (a), it then follows that $A_i$
can make at least one more (one-stage) communication than can $A_1$.

We now return to the proof of the theorem. Let $r_1, r_2, \ldots, r_n$ be
the row sums of the matrix $C$. By renaming the individuals, if necessary,
we can assume that the largest row sum is $r_1$, that is, $r_1 \geq r_k$
for $k = 1, 2, \ldots, n$. We shall show that $A_1$ can communicate with
everyone else in one or two stages. (The proof is based on the indirect
method.) Suppose, on the contrary, that there is an individual
$A_i$, where $i > 1$, with whom $A_1$ cannot so communicate. By the
statement proved above, $A_i$ can communicate in one stage with at
least one more person than $A_1$ can. But this implies that $r_i > r_1$,
which contradicts the fact that we have named the individuals so
that $r_1 \geq r_i$. This contradiction establishes the theorem.


An additional conclusion which can be made from the proof of the 
theorem is that the individual or individuals having the largest row 
sum in the matrix C can communicate with everyone else in one or 
two stages. Similarly, the individuals having the largest column sum 
can be communicated with by everyone in one or two stages. 

The network shown in Figure 7 satisfies the hypothesis of the
theorem, hence its conclusion. The communication matrix for
this network is

[Figure 7 and the matrix $C$ for its network: not reproduced.]

Here the maximum row sum of 2 occurs in rows one, three, and four,
so that $A_1$, $A_3$, and $A_4$ can communicate with everyone else in one or
two stages. (Find the necessary communication paths in Figure 7.)
However, it requires three stages for $A_2$ to communicate with $A_1$.
The maximum sum of 3 occurs in column two so that $A_2$ can be communicated
with by everyone else in one or two stages (actually one
stage is enough). It happens also that $A_3$ and $A_4$ can also be communicated
with in one or two stages; however, as observed above,
$A_1$ cannot be.

Neither of the networks in Figure 5 satisfies the hypothesis of the 
theorem. It happens that the network in Figure 5(a) does satisfy the 
conclusion of the theorem, while the network in Figure 5(b) does not. 
(See Exercise 4.) 
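
As a check on these two claims, here is a small Python sketch, not from the original text. The function names are our own, and it tests only the first half of the conclusion (a person who can reach everyone else in one or two stages), using the two matrices of Figure 6 as we read them.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def satisfies_hypothesis(c):
    """True when, for every pair, at least one can reach the other in one stage."""
    n = len(c)
    return all(c[i][j] or c[j][i] for i in range(n) for j in range(i + 1, n))

def two_stage_senders(c):
    """0-based indices of rows of S = C + C^2 that are nonzero off the diagonal."""
    n = len(c)
    c2 = mat_mul(c, c)
    s = [[c[i][j] + c2[i][j] for j in range(n)] for i in range(n)]
    return [i for i in range(n) if all(s[i][j] for j in range(n) if j != i)]

fig_a = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
fig_b = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]

print(satisfies_hypothesis(fig_a), two_stage_senders(fig_a))
print(satisfies_hypothesis(fig_b), two_stage_senders(fig_b))
```

Neither network passes the hypothesis test, yet the first still has persons who can reach everyone in at most two stages, while the second has none.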


EXERCISES 


1. Find the communication matrices for the following communication 
networks. 




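2. Draw the directed graphs corresponding to the following communication
matrices.
\[
\text{(a)}\ \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}
\qquad
\text{(b)}\ \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}
\]
\[
\text{(c)}\ \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix}
\qquad
\text{(d)}\ \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\]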

3. Which of the communication networks whose matrices are given in 
Exercise 2 satisfy the hypothesis of the theorem of this section? 

[Ans. (a) and (c).] 


4. Show that the network in Figure 5(a) satisfies the conclusion of the 
theorem, while the network in Figure 5(b) does not. 


5. By computing the matrix $S$ in each case, find the persons who can
communicate with everyone else in one or two stages and those who can be
communicated with in one or two stages, for the communication matrices in
Exercise 2. (In some cases such persons need not exist. See Exercise 3.)

[Ans. (a) Everyone, (b) Everyone, (d) Neither type of person exists.] 

6. Find all communication networks among three individuals which
satisfy the hypothesis of the theorem of this section. How many of these are
essentially different? [Ans. There are seven.]

7. Show that the theorem stated in the last section follows from the 
theorem proved in the present section. 

8. If $C$ is a communication matrix, give an interpretation for the entries
of the matrix $C^3$. Do the same for the matrix $C^4$.

[Ans. The entry in row $i$ and column $j$ of $C^3$ gives the number of
three-stage communications from $i$ to $j$; the same entry of $C^4$ gives the
number of four-stage communications from $i$ to $j$.]

9. If $C$ is a communication matrix, give an interpretation for the entries
of the matrix $S = C + C^2 + C^3 + \cdots + C^m$.


10. Prove the second statement of the theorem of the present section. 


11. Prove that the following statement is true: In a communication network
involving three individuals, it is possible for a message starting from
any person to get to any other person if and only if the following condition
is satisfied: each individual can send a message to at least one person and can
receive a message from at least one person.

12. Show that the matrix form of the condition in Exercise 11 is: every 
row and column of the communication matrix must have at least one nonzero 
entry. 

13. Is the statement in Exercise 11 true for a communication network
involving two individuals? For four or more individuals? [Ans. Yes; no.]

3. STOCHASTIC PROCESSES IN GENETICS

The simplest type of inheritance of traits in animals occurs when a 
trait is governed by a pair of genes, each of which may be of two 
types, say G and g. An individual may have a GG combination or Gg 
(which is genetically the same as gG) or gg. Very often the GG and Gg 
types are indistinguishable in appearance, and then we say that the 
G gene dominates the g gene. An individual is called dominant if he 
has GG genes, recessive if he has gg, and hybrid with a Gg mixture. 

In the mating of two animals, the offspring inherits one gene of the 
pair from each parent, and the basic assumption of genetics is that 
these genes are selected at random, independently of each other. This 
assumption determines the probability of every type of offspring. 
Thus the offspring of two dominant parents must be dominant, of two 
recessive parents must be recessive, and of one dominant and one 
recessive parent must be hybrid. In the mating of a dominant and a
hybrid animal, the offspring must get a G gene from the former and
has probability 1/2 for getting G or g from the latter, hence the
probabilities are even for getting a dominant or a hybrid offspring. Again
in the mating of a recessive and a hybrid, there is an even chance of
getting either a recessive or a hybrid. In the mating of two hybrids,
the offspring has probability 1/2 for getting a G or a g from each parent.
Hence the probabilities are 1/4 for GG, 1/2 for Gg, and 1/4 for gg.

Example 1. Let us consider a process of continued crossings. We
start with an individual of unknown genetic character, and cross it
with a hybrid. The offspring is again crossed with a hybrid, etc. The
resulting process is a Markov chain. The states are “dominant,”
“hybrid,” and “recessive.” With rows and columns taken in the order
$d$, $h$, $r$, the transition probabilities are
\[
P = \begin{pmatrix} \tfrac12 & \tfrac12 & 0 \\ \tfrac14 & \tfrac12 & \tfrac14 \\ 0 & \tfrac12 & \tfrac12 \end{pmatrix} \tag{1}
\]
as can be seen from the previous paragraph. The matrix $P^2$ has all
entries positive (see Exercise 1), hence we know from Chapter V,
Section 7, that there is a unique fixed point probability vector, i.e., a
vector $p$ such that $pP = p$. By solving three equations, we find the
fixed vector to be $p = (1/4, 1/2, 1/4)$. Hence, no matter what type the
original animal was, after repeated crossing we have probability nearly 1/4
of having a dominant, 1/2 of having a hybrid, and 1/4 of having a recessive
offspring.
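
The fixed vector is easy to confirm numerically. The sketch below is ours, not the text's: it uses Python's exact `Fraction` arithmetic, the helper `vec_mat` is a name we introduce, and the rows of `P` are the transition probabilities for crossing a dominant, a hybrid, and a recessive with a hybrid, as derived in the preceding paragraphs.

```python
from fractions import Fraction as F

# Rows and columns in the order dominant, hybrid, recessive.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]

def vec_mat(v, m):
    """Row vector times matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

p = [F(1, 4), F(1, 2), F(1, 4)]
print(vec_mat(p, P) == p)  # p is unchanged by one more crossing

# Start from a recessive animal and cross with hybrids ten times.
dist = [F(0), F(0), F(1)]
for _ in range(10):
    dist = vec_mat(dist, P)
print([float(x) for x in dist])
```

Starting from a dominant or a hybrid instead only changes the early steps; the distribution still settles near the same vector.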

Example 2. If we keep crossing the offspring with a dominant
animal, the result is quite different. The transition probabilities are
\[
P' = \begin{pmatrix} 1 & 0 & 0 \\ \tfrac12 & \tfrac12 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \tag{2}
\]

No power of this is all positive, and hence our general theorem does
not apply. If we solve for $p$, we find that $(1,0,0)$ is a unique probability
vector fixed point. But here the components are not all positive.
Therefore, after sufficiently long time has elapsed, we can be
almost certain that we have a dominant offspring. This is easy to
verify. Even if we start with a recessive animal, after a single crossing,
the offspring cannot be recessive. It may be hybrid, but the
probability of having a hybrid $n$ times in a row is $(1/2)^n$, which tends
to zero. And once we have a dominant offspring, all future animals
will be dominant. The analysis for crossing with recessive animals is
very similar (see Exercise 2).

We can interpret our results for crossings of large numbers of animals.
If a given population is crossed with hybrids, and the offspring
are all crossed with hybrids, etc., then eventually we will have
approximately 1/4 dominants, 1/2 hybrids, and 1/4 recessives. While if we
keep crossing them with dominants, then after sufficiently many
crossings we can expect only dominants.

In Example 1 we may ask a more difficult question. Suppose that
we have a regular matrix $P$ (as in Example 1), with states $s_1, \ldots, s_n$.
The process keeps going through all the states. If we are in $s_i$, how
long, on the average, will it take for the process to return to $s_i$? We
can even ask the more general question of how long, on the average,
it takes to go from $s_i$ to $s_j$.

The average here is taken in the sense of an expected value. There
is a probability $p_1$ that we reach $s_j$ for the first time in one step,
$p_2$ that we reach it first in two steps, etc. The expected value is
$p_1 \cdot 1 + p_2 \cdot 2 + \cdots$. (See Chapter IV, Section 12.) This, in general,
requires a difficult computation. However, there is a much simpler
way of finding the expected values. Let the expected number of steps
required to go from state $s_i$ to $s_j$ be $m_{ij}$. How can we go from $s_i$ to
$s_j$? We go from $s_i$ to $s_k$ with probability $p_{ik}$, which is one step. If
$k = j$, we are there. If $k \neq j$, it takes an average of $m_{kj}$ steps more.
Hence $m_{ij}$ is the sum of $p_{ik} m_{kj}$ for all $k \neq j$, plus 1. Let $\bar{M}$ be the
matrix obtained from $M$ by placing zeros in all the diagonal entries
$m_{ii}$, and let $C$ be the square matrix having all entries equal to 1. Then
our equations can be replaced by the single matrix equation
\[ M = P\bar{M} + C. \tag{3} \]

The $m_{ij}$ can be found by solving these simultaneous equations (see
Exercise 8). Let us multiply both sides of the equation by $P^n$. Observe
that $PC = C$; hence $P^n C = C$ (see Exercise 9).
\[ P^n M = P^{n+1}\bar{M} + C. \tag{4} \]

We know that, for a regular $P$, $P^n$ approaches a matrix each of whose
rows is the fixed point vector $p$. Hence all the rows of (4) approach
the same vector equation,
\[ pM = p\bar{M} + (1, \ldots, 1) \tag{5} \]
or
\[ p(M - \bar{M}) = (1, \ldots, 1). \tag{6} \]

But all components of $M - \bar{M}$ except the diagonal ones are 0. Hence
our equation simply states that $p_i m_{ii} = 1$ for each $i$. This tells us that
$m_{ii} = 1/p_i$. The average time it takes to return from $s_i$ to $s_i$ is the
reciprocal of the limiting probability of being in $s_i$. In Example 1 this means
that if we have a dominant offspring we will have another dominant
in an average of four steps, after a hybrid we have another hybrid
in an average of two steps, and a recessive follows a recessive on the
average in four steps.
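
The simultaneous equations for the expected numbers of steps can be solved column by column. The following Python sketch is our own illustration rather than the text's method of hand solution; `solve` and `mean_first_passage` are hypothetical helper names, and the matrix is that of Example 1 (crossing with a hybrid, states in the order dominant, hybrid, recessive).

```python
from fractions import Fraction as F

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(a)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def mean_first_passage(p):
    """Matrix of expected step counts from each state to each target state."""
    n = len(p)
    m = [[None] * n for _ in range(n)]
    for j in range(n):
        # For a fixed target j: m_ij = 1 + sum over k != j of p_ik * m_kj.
        a = [[F(int(i == k)) - (p[i][k] if k != j else 0) for k in range(n)]
             for i in range(n)]
        column = solve(a, [F(1)] * n)
        for i in range(n):
            m[i][j] = column[i]
    return m

P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]
M = mean_first_passage(P)
for row in M:
    print([str(x) for x in row])
```

The diagonal entries come out as 4, 2, 4, the reciprocals of the components of the fixed vector (1/4, 1/2, 1/4).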

Example 3. A more interesting, and also more complex, process is
obtained by crossing a given population with itself, and then crossing
the offspring with offspring, etc. Let us suppose that our population
has a fraction $d$ of dominants, $h$ hybrids, and $r$ recessives. Then
$d + h + r = 1$. If the population is very large and they are mated
at random, then (by the law of large numbers) we can expect $d^2$ to be
the fraction of matings in which both parents are dominant, $2dh$
the fraction of mating a dominant with a hybrid, etc. Hence we
have simple formulas for the (approximate) fraction of offspring
of various types. We will compute the fraction of dominants as an
example.

To have a dominant offspring we must have mated two dominant
parents, or a dominant and a hybrid parent, or two hybrids. In the
first case we always get a dominant offspring, in the second the
probability is 1/2; in the third case it is 1/4. Hence the fraction of dominant
offspring is
\[ d^2 + \tfrac12 \cdot 2dh + \tfrac14 h^2 = d^2 + dh + \tfrac14 h^2. \]

If we represent the fractions in a given generation by a row vector,
the process may be thought of as a transformation $T$ which changes
a row vector into another row vector.
\[ (d,h,r) \cdot T = \left(d^2 + dh + \tfrac14 h^2,\; dh + rh + 2dr + \tfrac12 h^2,\; r^2 + rh + \tfrac14 h^2\right). \tag{7} \]

The trouble is that (see Exercise 3) the transformation $T$ is not linear.
Nevertheless, we know that after $n$ crossings the distribution will be
$(d,h,r)T^n$, so that, if we can get a simple formula for $T^n$, we can
describe the results simply. And here luck is with us.

Let us compute $T^2$, i.e., find what happens if we apply twice the
transformation specified above. The first generation of offspring is
distributed according to the formula (7). We now take the first
component on the right side as $d$, the second as $h$, and the third as $r$, and
compute $d^2 + dh + \tfrac14 h^2$, etc. Here we find to our surprise that $T^2 = T$.
Hence $T^n = T$.
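
A single numerical instance makes the surprise concrete. The sketch below is ours: the function name `cross` and the starting fractions (1/2, 1/5, 3/10) are arbitrary choices for illustration, and the three components are those of formula (7).

```python
from fractions import Fraction as F

def cross(dist):
    """Apply the random-mating transformation T of formula (7) once."""
    d, h, r = dist
    return (
        d * d + d * h + h * h / 4,              # dominant offspring
        d * h + r * h + 2 * d * r + h * h / 2,  # hybrid offspring
        r * r + r * h + h * h / 4,              # recessive offspring
    )

start = (F(1, 2), F(1, 5), F(3, 10))  # hypothetical starting population
once = cross(start)
twice = cross(once)
print(once)
print(twice == once)
```

One example does not prove that applying the transformation twice always equals applying it once, but it shows what the general computation of Exercise 5 must produce.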

This means that $(d,h,r)T = (d,h,r)T^n$, which in turn means that
the distribution after many generations is the same as in the first
generation of offspring. Hence we say that the process reaches an
equilibrium in one step. It must, however, be remembered that our
fractions are only approximate, and are a good approximation only
for very large populations.

For the geneticist, this result is very interesting. It shows that, in a 
population in which no mutations occur and selection does not take 
place, “evolution” is all over in a single generation. 

To the mathematician the process is interesting since it is an example
of a quadratic transformation, a transformation more complex
than the linear ones we have heretofore studied.

EXERCISES

1. From (1) compute $P^2$, $P^3$, $P^4$, and $P^5$. Verify that $P^2 > 0$ and that
the powers approach the expected form (see Chapter V, Section 7).

2. Set up the matrix corresponding to P' in (2) for the case of repeated 
crossing with recessive animals. Find the fixed point probability vector, and 
interpret it. 

3. Prove that $T$ is not a linear transformation. (Hint: check the conditions
on linearity given in Chapter V, Section 9, and show by means of an
example that $T$ does not have one of these properties.)

4. In the text we computed the first component of (7). Verify that the
other two are correctly given.

5. Compute $T^2$ by taking the first component of (7) as $d$, the second as $h$,
the third as $r$, and substituting into the formula (7). Making use of the fact
that $d + h + r = 1$, show that $T^2 = T$.

6. A fixed point of $T$ is a vector such that $(d,h,r)T = (d,h,r)$. Write the
conditions that such a vector must satisfy, and give three examples of such
fixed vectors. What is the genetic meaning of such a distribution?

[Ans. For example, $(1/4, 1/2, 1/4)$.]

7. In the matrix P the second row is equal to the fixed point vector. 
What significance does this have? 

8. For Example 1 write the matrix $M$ with unknown entries $m_{ij}$. Write
$\bar{M}$ by replacing $m_{11}$, $m_{22}$, and $m_{33}$ by zeros. Then solve the nine simultaneous
equations given by (3), to find the $m_{ij}$. Check that $m_{ii} = 1/p_i$.

[Ans. $m_{11} = 4$; $m_{12} = 2$, $m_{13} = 8$.]

9. From the definition of a stochastic matrix (Chapter V, Section 7), 
prove that PC = C. 

10. Prove that, if P is a regular n X n stochastic matrix having column 
sums equal to 1, then it takes an average of n steps to return from any state 
to itself. (Cf. Chapter V, Section 7, Exercise 8.) 

11. It is raining in the Land of Oz. In how many days can the Wizard of Oz 
expect to go on a picnic? (Cf. Chapter V, Section 8, Exercise 1.) [Ans. 4.] 

The remaining exercises develop a simpler method of treating the nonlinear 
transformation T, in the text above. 

12. Let $p$ be the ratio of G genes in the population, and $q = 1 - p$ the ratio
of g genes. Express $p$ and $q$ in terms of $d$, $h$, and $r$.

[Ans. $p = d + \tfrac12 h$, $q = r + \tfrac12 h$.]

13. Suppose that we take all the genes in the population, mix them thoroughly,
and select a pair at random for each offspring. Show, using the
result of Exercise 12, that the resulting distribution of dominant, hybrid, and
recessive individuals is precisely that given in (7).

[Ans. $(d,h,r) \cdot T = (p^2, 2pq, q^2)$.]

14. If we write $(d,h,r) \cdot T = (d',h',r')$, show, using the result of Exercise 13,
that $h'^2 = 4d'r'$.

15. Show that for equilibrium it is necessary that $h^2 = 4dr$.

16. Show that if $h^2 = 4dr$, then $p^2 = d$, $q^2 = r$, and $2pq = h$. Hence show
that this condition is also sufficient for equilibrium.

17. Use the results of Exercises 14-16 to show that the population reaches 
equilibrium in one generation. 

4. ABSORBING MARKOV CHAINS AND GENETICS 

There was an essential difference between the results of the first 
two examples in the last section. In Example 1 the process could 
have been in any one of the three states after a long time, and all 
we knew was what the three probabilities were. These we were able 
to obtain from the fact that we had a unique probability vector fixed 
point with all its components positive. In that process the fixed point 
furnished all the interesting information. 

In Example 2 the fixed point told us only that eventually the process
would end up in the first state, and would stay there. The principal
characteristic of the first state is that, once the process enters
this state, it cannot leave it. Such a state is described as absorbing.

Definition. A state in a Markov chain is an absorbing state if it is 
impossible to leave it. 

Definition. A Markov chain is called absorbing if (1) it has at least 
one absorbing state, and (2) from every state it is possible to go to an 
absorbing state (not necessarily in one step). 

Theorem. In an absorbing Markov chain it is certain that the 
process will end up in one of the absorbing states. 

We shall indicate only the basic idea of the proof of the theorem.
From each nonabsorbing state, $s_j$, it is possible to reach an absorbing
state. Let $n_j$ be the minimum number of steps required to reach an
absorbing state, starting from state $s_j$. Let $p_j$ be the probability that,
starting from state $s_j$, the process will not reach an absorbing state
in $n_j$ steps. Then $p_j < 1$. Let $n$ be the largest of the $n_j$ and $p$ be the
largest of the $p_j$. The probability of not being absorbed in $n$ steps
is at most $p$, in $2n$ steps is at most $p^2$, etc. Since $p < 1$, these
probabilities tend to zero.

For an absorbing Markov chain we consider three interesting questions:
(a) What is the probability that the process will end up in a
given absorbing state? (b) On the average, how long will it take for
the process to reach an absorbing state? (c) On the average, how
many times will the process be in each nonabsorbing state? The
answer to all these questions depends, in general, on which state the
process starts from.

If there are at least two absorbing states, $s_1$ and $s_2$, the answer to
question (a) must depend on the starting position. If the process
starts in $s_1$, it ends up in $s_1$ with probability 1, while, if the process
starts in $s_2$, it ends up in $s_1$ with probability 0. On the other hand, if
$s_1$ is absorbing and $s_2$ is not, the answers to questions (b) and (c) depend
on the starting point. If the process starts in $s_1$, the answer to
both questions is 0. But if it starts in $s_2$ it takes at least one step
to reach an absorbing state, and it will be in at least one nonabsorbing
state once (namely $s_2$). (See Exercise 1.)

Let $P$ be the matrix of transition probabilities for an absorbing
Markov chain. Let $s$ be an absorbing state. We want to find the
probability that the process ends up in $s$, and this depends on where
it starts. Let $d_i$ be the probability of the process ending up at $s$, if it
starts at $s_i$. Form the column vector $d$ having $d_i$ as its $i$th component.
From state $s_i$ the process can go to state $s_j$ with probability $p_{ij}$. If it
does, there is a probability $d_j$ of going on from state $s_j$ to $s$. Hence
$p_{ij} d_j$ is the probability of going from $s_i$ to $s$ via $s_j$. The sum of these
terms gives the total probability of going from $s_i$ to $s$. Hence we will
have
\[ d_i = p_{i1} d_1 + p_{i2} d_2 + \cdots + p_{in} d_n, \]
for $i = 1, 2, \ldots, n$. But the sum on the right side is simply the $i$th
component of $Pd$; hence we can shorten our equations to the single
vector equation
\[ d = Pd. \tag{1} \]

We note that $d$ would be a “fixed point” of $P$, except that it is a
column vector instead of a row vector. For an absorbing Markov
chain there frequently is more than one probability vector solution
of (1). We can select the one we want by imposing the conditions

(2) The component of $d$ corresponding to $s$ is 1; corresponding
to any other absorbing state is 0.

This condition gives us a unique fixed vector $d$ for every absorbing
state, the vectors differing only with respect to (2).

Next we will study question (c), since the answer to question (b)
will be a direct consequence. Let the process start in a nonabsorbing
state $s_i$. How many times do we expect it to be in nonabsorbing state
$s_j$? (Here “expect” is to be taken in the sense of Chapter IV, Section
12.) Let us call this number $t_{ij}$. From $s_i$, it moves to state $s_k$ with
probability $p_{ik}$. If $s_k$ is an absorbing state, it will never come to $s_j$.
If it is nonabsorbing, we expect it to come to $s_j$ a total of $t_{kj}$ times.
Hence $p_{ik} t_{kj}$ has to be summed for all nonabsorbing states $s_k$. But we
do not yet have the entire answer, for if $i = j$, then we must not forget
that we started in $s_j$, and hence we must add 1. For $s_i$, $s_j$ nonabsorbing,
\[ t_{ij} = p_{i1} t_{1j} + \cdots + p_{in} t_{nj} \quad (+\,1 \text{ if } i = j), \]
summed over the nonabsorbing states.

We can rewrite all these equations as a single matrix equation. Let
us form the truncated matrix $Q$ which is obtained from $P$ by crossing
out all rows and columns corresponding to absorbing states. Let $a$ be
the number of absorbing states; then $Q$ is an $(n - a) \times (n - a)$
matrix. The matrix $T$, whose components are $t_{ij}$, which are defined
only when $s_i$ and $s_j$ are nonabsorbing, is the same size as $Q$. The sum
on the right side above is simply the product of row $i$ of $Q$ with column
$j$ of $T$; hence it is component $ij$ of $QT$. To this we want to add 1 if
$i = j$; hence we want to add the corresponding component of the
identity matrix. This yields the matrix equation

(3) T = QT + I. 

We can rewrite equation (3) as
\[ I = T - QT = (I - Q)T. \]

Multiplying both sides by $(I - Q)^{-1}$, we obtain the solution
\[ T = (I - Q)^{-1}. \tag{4} \]

Thus we find that the components of $(I - Q)^{-1}$ provide the answer
to question (c). They also provide the answer to question (b). Let
us suppose that we want to know how many steps it takes the process
on the average to reach an absorbing state from $s_i$ (nonabsorbing).
There will be one step for each time that it is in a nonabsorbing state.
Hence the total number of times it is in a nonabsorbing state is the
same as the number of steps it takes to reach an absorbing state. But
this total is simply the number of times on the average that the
process is in each nonabsorbing state, starting with $s_i$, which is the
sum of row $i$ of $(I - Q)^{-1}$. Hence the row sums of (4) provide
the answer to question (b). Let us write this in vector form. Let $t_i$
be the expected number of steps that take us from a nonabsorbing
$s_i$ to an absorbing state, and let $t$ be the column vector having $t_i$ as
components. Then $t$ consists of the row sums of (4). By using the
column vector $c$ having all components equal to one, we can write
\[ t = (I - Q)^{-1} c. \tag{5} \]

Example 1. Let us return to equation (2) of the last section.
$P'$ represents a chain with one absorbing state, $s_1$. If we try to solve
the equation $d = P'd$, we find that the solution can be any column
vector all of whose components are the same. Using condition (2),
we know that its first component must be 1; hence all components
are 1. Thus
\[ d = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]

This means that no matter where the process is, the probability of
ending up in $s_1$ is 1. This agrees with our first theorem, since there is
only one absorbing state.

Next we form the truncated matrix, with rows and columns indexed by
$s_2$ and $s_3$,
\[ Q' = \begin{pmatrix} \tfrac12 & 0 \\ 1 & 0 \end{pmatrix}; \]
then
\[ I - Q' = \begin{pmatrix} \tfrac12 & 0 \\ -1 & 1 \end{pmatrix} \quad \text{and} \quad T = (I - Q')^{-1} = \begin{pmatrix} 2 & 0 \\ 2 & 1 \end{pmatrix}. \]

Hence we see that if the process starts in $s_2$, we can expect it to be
in this state twice (including the starting position). If we are in $s_3$
we can expect it to be there only once, namely at the start, which is
clear since it is impossible for it to return to $s_3$. And starting from $s_3$
we can expect it to be in $s_2$ twice.
\[ t = Tc = \begin{pmatrix} 2 \\ 3 \end{pmatrix}. \]

Hence we expect it to reach $s_1$ in two steps from $s_2$ and in three steps
from $s_3$.
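
The inversion in this example can be reproduced with a few lines of exact arithmetic. This Python sketch is our own illustration; the helper `inverse` is a name we introduce, and `Q` is the truncated matrix for repeated crossing with a dominant animal, with states in the order hybrid, recessive.

```python
from fractions import Fraction as F

def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination in exact arithmetic."""
    n = len(m)
    aug = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Truncated matrix: a hybrid stays hybrid with probability 1/2,
# a recessive always produces a hybrid.
Q = [[F(1, 2), F(0)],
     [F(1), F(0)]]
n = len(Q)
I_minus_Q = [[F(int(i == j)) - Q[i][j] for j in range(n)] for i in range(n)]

T = inverse(I_minus_Q)       # expected visits to each nonabsorbing state
t = [sum(row) for row in T]  # expected steps until absorption
print(T)
print(t)
```

The row sums are the expected numbers of crossings before a dominant offspring appears.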

Example 2. Let us construct a more complicated example of an
absorbing Markov chain. We start with two animals of opposite sex,
cross them, select two of their offspring of opposite sex and cross those,
etc. To simplify the example we will assume that the trait under
consideration is independent of sex.

Here a state is determined by a pair of animals. Hence the states of
our process will be: $s_1 = (D,D)$, $s_2 = (D,H)$, $s_3 = (D,R)$, $s_4 = (H,H)$,
$s_5 = (H,R)$, and $s_6 = (R,R)$. Let us illustrate the calculation of
transition probabilities in terms of $s_2$. When the process is in this state,
one parent has GG genes, the other Gg. Hence the probability of a
dominant offspring or a hybrid offspring is 1/2 for each. Then the
probability of transition to $s_1$ (selection of two dominants) is 1/4,
transition to $s_2$ is 1/2, and to $s_4$ is 1/4. The transition matrix, with rows
and columns in the order $s_1, \ldots, s_6$, is
\[
P'' = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
\tfrac14 & \tfrac12 & 0 & \tfrac14 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
\tfrac1{16} & \tfrac14 & \tfrac18 & \tfrac14 & \tfrac14 & \tfrac1{16} \\
0 & 0 & 0 & \tfrac14 & \tfrac12 & \tfrac14 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\]

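There are two absorbing states, $s_1$ and $s_6$. The probabilities of
absorption in $s_1$ are found by solving $d = P''d$, with the conditions
$d_1 = 1$ and $d_6 = 0$, giving

$$
d = \begin{pmatrix} 1 \\ \frac{3}{4} \\ \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{4} \\ 0 \end{pmatrix}.
$$

Since the sum of these vectors $d$ must be $c$ (see Exercise 6) it follows
that the other vector, giving the probability of absorption in $s_6$, is
$c - d$.

The genetic interpretation of absorption is that after a large number
of inbreedings either the G or the g gene must disappear. It is
also interesting to note in the vector $d$ that the probability of ending
up entirely with G genes, if we start from a given state, is equal to the
proportion of G genes in this state.

Next we form the truncated matrix

$$
Q'' = \begin{array}{c|cccc}
 & s_2 & s_3 & s_4 & s_5 \\ \hline
s_2 & \frac{1}{2} & 0 & \frac{1}{4} & 0 \\
s_3 & 0 & 0 & 1 & 0 \\
s_4 & \frac{1}{4} & \frac{1}{8} & \frac{1}{4} & \frac{1}{4} \\
s_5 & 0 & 0 & \frac{1}{4} & \frac{1}{2}
\end{array}
$$

and

$$
(I - Q'') = \begin{pmatrix}
\frac{1}{2} & 0 & -\frac{1}{4} & 0 \\
0 & 1 & -1 & 0 \\
-\frac{1}{4} & -\frac{1}{8} & \frac{3}{4} & -\frac{1}{4} \\
0 & 0 & -\frac{1}{4} & \frac{1}{2}
\end{pmatrix}
$$

and

$$
T = (I - Q'')^{-1} = \begin{pmatrix}
\frac{8}{3} & \frac{1}{6} & \frac{4}{3} & \frac{2}{3} \\
\frac{4}{3} & \frac{4}{3} & \frac{8}{3} & \frac{4}{3} \\
\frac{4}{3} & \frac{1}{3} & \frac{8}{3} & \frac{4}{3} \\
\frac{2}{3} & \frac{1}{6} & \frac{4}{3} & \frac{8}{3}
\end{pmatrix}
$$

and

$$
t = Tc = \begin{pmatrix} \frac{29}{6} \\ \frac{20}{3} \\ \frac{17}{3} \\ \frac{29}{6} \end{pmatrix}.
$$

Hence we see that, if we start in a state other than $(D,D)$ or $(R,R)$,
we can expect to reach one of these states in about five or six steps.
The exact expected times are given by the entries of $t$. The matrix $T$
provides more detailed information, namely how many times we can
expect to have offspring of the types $(D,H)$, $(D,R)$, $(H,H)$, and $(H,R)$,
starting from a given nonabsorbing state. And the vectors $d$ and
$c - d$ give the probabilities of ending up in $s_1$ or $s_6$, respectively.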
These quantities jointly give us an excellent description of what we 
can expect of our process. 
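
The quantities of Example 2 can be recomputed from the transition matrix alone. The sketch below is an illustration in Python under two assumptions: the entries of $P''$ are those implied by the genetic rules of the example, and the helper `inverse` is our own. It uses the fact that, with $d_1 = 1$ and $d_6 = 0$, the equation $d = P''d$ restricted to the nonabsorbing states reads $(I - Q'')d_N = r$, where $r$ is the column of one-step probabilities of entering $s_1$.

```python
from fractions import Fraction as F

def inverse(m):
    """Invert a small square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(m)
    a = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

# Assumed transition matrix P''; rows and columns are s1..s6 =
# (D,D), (D,H), (D,R), (H,H), (H,R), (R,R).
P = [
    [F(1), F(0), F(0), F(0), F(0), F(0)],
    [F(1, 4), F(1, 2), F(0), F(1, 4), F(0), F(0)],
    [F(0), F(0), F(0), F(1), F(0), F(0)],
    [F(1, 16), F(1, 4), F(1, 8), F(1, 4), F(1, 4), F(1, 16)],
    [F(0), F(0), F(0), F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(0), F(0), F(0), F(0), F(1)],
]

nonabsorbing = [1, 2, 3, 4]   # list indices of s2, s3, s4, s5
Q = [[P[i][j] for j in nonabsorbing] for i in nonabsorbing]
k = len(Q)

T = inverse([[F(int(i == j)) - Q[i][j] for j in range(k)] for i in range(k)])
t = [sum(row) for row in T]            # expected steps to absorption

r = [P[i][0] for i in nonabsorbing]    # one-step probabilities of entering s1
d_inner = [sum(T[i][j] * r[j] for j in range(k)) for i in range(k)]
d = [F(1)] + d_inner + [F(0)]          # absorption probabilities in s1
```

The entries of `d` come out equal to the proportion of G genes in each state, as noted in the example.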


EXERCISES 

1. Prove that in an absorbing Markov chain: 

(a) The probability of reaching a given absorbing state is independent 
of the starting state if and only if there is only one absorbing state. 

(b) The expected time for reaching an absorbing state is independent 
of the starting state if and only if every state is absorbing. 

2. Verify that the inverse $(I - Q')^{-1}$ is correctly given in the text.

3. Verify that the inverse $(I - Q'')^{-1}$ is correctly given in the text.

4. Solve the equation $d = P'd$ (see page 322).

5. Find two solutions of $d = P''d$, corresponding to absorption in $s_1$ and
$s_6$, respectively. Verify that their sum is $c$ (i.e., a vector all of whose
components are 1).

6. Consider all vectors d which represent probabilities of absorption in a 
given absorbing state. Interpret the sum of two such vectors. Interpret the 
sum of all such vectors. What must the sum of all these vectors be? 

7. Find two different probability vector fixed points of P". 

[Ans. (1,0,0,0,0,0); (0,0,0,0,0,1).] 

8. There is an alternate method of computing $T$: we want to know the
probability of being in $s_j$ after $n$ steps if we start in $s_i$, and sum this for all $n$.
The sum of these will be $t_{ij}$.

(a) Show that the probability after zero steps is given by $I$.

(b) Show that the probability after $n > 0$ steps is $Q^n$.

(c) Show that $T = I + Q + Q^2 + Q^3 + \cdots$.

(d) Compute the sum of this series as if it were an ordinary geometric
series.

(e) Verify that the answer is the same as (4).

9. There is a simpler method of finding $t$. If $t_i$ is the expected number of
steps for reaching an absorbing state from $s_i$, this must be the same as taking
one more step and then adding $p_{ij}t_j$ for every nonabsorbing state $s_j$.

(a) Give reasons for the above claim that $t_i = 1 + {}$ sum of $p_{ij}t_j$ over
nonabsorbing states.

(b) Write these equations as a single vector equation.

(c) Solve for $t$.

(d) Verify that the solution is the same as (5).


10. Suppose that hybrids have a high mortality rate; say that half of the 
hybrids die before maturity, while only a negligible number of dominants 
and recessives die before maturity. 

(a) In Example 2 above, modify the matrix P" to apply to this situa¬ 
tion. 

(b) What are the absorbing states? 

(c) Verify that it is an absorbing chain. 

(d) Find the vectors $d$ representing the probabilities of absorption in
the various absorbing states.

[Ans. For $s_1$, $d = \left(1, \frac{9}{10}, \frac{1}{2}, \frac{1}{2}, \frac{1}{10}, 0\right)^T$.]


(e) Find T, and interpret. 

(f) Find t, and interpret. 


[Ans. t = 



The remaining problems concern the inheritance of color-blindness, which 
is a sex-linked characteristic. There is a pair of genes, C and N , of which the 
former tends to produce color-blindness, the latter normal vision. The N 
gene is dominant. But a man has only one gene, and if this is C, he is color¬ 
blind. A man inherits one of his mother’s two genes, while a woman inherits 
one gene from each parent. Thus a man may be of type C or N, while a 
woman may be of type CC or CN or NN. We will study a process of in- 
breeding similar to that of Example 2. 

11. List the states of the chain. (Hint: There are six.)

12. Compute the transition probabilities.

13. Show that the chain is absorbing, and interpret the absorbing states.
[Ans. In one the N gene disappears, in the other the C gene is lost.]

14. Prove that the probability of absorption in the state having only C
genes, if we start in a given state, is equal to the proportion of C genes in that
state.

15. Find $T$, and interpret.

16. Find $t$, and interpret.

[Ans. If we start with both C and N genes, we can expect one of
these to disappear in five or six crossings.]


5. THE ESTES LEARNING MODEL 

In this section we shall discuss a mathematical model for learning 
proposed by W. K. Estes. We shall not give the most general theory, 
but only some special cases. 

The theory was developed to explain certain kinds of learning which 
can be illustrated by experiments of the following kind. Suppose for 
example that a rat is put in a T maze (see Chapter III, Section 2) 
and goes either right or left. The experimenter places food on one 
side, and if the rat goes to the correct side he is rewarded. This ex¬ 
periment is then repeated many times, using some particular feeding 
schedule. The interest here lies in trying to predict the behavior of 
the rat under the different feeding schedules. For example, if the food 
is always placed on the right side, will the rat eventually learn this 
and always go right? 

A similar experiment, performed with a human subject, is the fol¬ 
lowing. A subject is given a sequence of heads and tails and each time 
is asked to guess what the next choice will be. He is to try to get as 
many right as possible. Again there are various ways that the ex¬ 
perimenter can produce his sequences of H’s and T’s, and the interest 
lies in how the subject will react to different choices. 

In the Estes model it is assumed that there are a finite number of
elements, called “stimulus elements.” At any given time each of these
elements is connected either to a response $R_0$ or to a response $R_1$.
These connections are allowed to change from experiment to experiment.

In a single experiment there is a certain probability $\theta$ that any
particular stimulus element will be sampled by the subject. To say
that an element is sampled is the same as to say that it has an effect
upon the subject on that experiment. It is assumed that elements
sampled and connected to $R_0$ influence the subject in the direction
of producing an $R_0$ response, and those sampled and connected to $R_1$
to produce an $R_1$ response.

The samplings of the various elements are assumed to be an independent
trials process (see Chapter IV, Section 8). Thus, for example,
if there are three stimulus elements $a$, $b$, and $c$, the probability that $a$
is sampled, $b$ is not sampled, and $c$ is sampled would be $\theta(1 - \theta)\theta$.
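
As a small illustration of the independent trials assumption, the following Python sketch computes the probability of any particular sampling pattern. The function name `sample_probability` and the value of $\theta$ are our own illustrative choices, not part of the model's statement.

```python
from fractions import Fraction
from itertools import product

def sample_probability(sampled_flags, theta):
    """Probability of one particular sampling pattern under independent trials.

    sampled_flags[i] is True when element i is sampled, False otherwise.
    """
    p = Fraction(1)
    for flag in sampled_flags:
        p *= theta if flag else 1 - theta
    return p

theta = Fraction(1, 4)   # illustrative value only

# a sampled, b not sampled, c sampled: theta * (1 - theta) * theta
p_pattern = sample_probability((True, False, True), theta)

# The probabilities of all eight patterns for three elements add up to 1.
total = sum(sample_probability(flags, theta)
            for flags in product((False, True), repeat=3))
```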

We also assume that the experimenter takes one of two possible
“reinforcing” actions, $A_0$ or $A_1$. This action may be taken before or
after the subject's choice, but we assume that the subject learns of the
choice of the experimenter only after he has made his own choice. In
most experiments the subject would like to make $R_0$ if the experimenter
makes $A_0$, and $R_1$ if the experimenter chooses $A_1$. We shall
say that the subject “guesses correctly” if he matches the choice of
the experimenter, i.e., does $R_0$ when the experimenter does $A_0$, or $R_1$
when the experimenter does $A_1$. In some experiments (e.g., the rat
experiment above), he is rewarded if he does guess correctly the choice
of the experimenter.

The following two basic assumptions are made: 

Assumption A. The probability that the subject makes response
$R_1$ is equal to the proportion of elements in the set sampled that are
connected to $R_1$. If no elements are sampled, the responses are assumed
equally likely.

Assumption B. If, in a given experiment, the experimenter chooses
$A_0$, then all the elements that were sampled on this experiment, and
that were connected to $R_1$, have their connections changed to $R_0$. If
the experimenter chooses $A_1$, then all the elements sampled and connected
to $R_0$ have their connections changed to $R_1$.

Note that in a single experiment only the set of elements that are 
actually sampled play a role, and these are the only elements whose 
connections can be changed by this experiment. In general, however, 
a different set will be sampled on each experiment, so that all the 
elements will at some time have an effect. 
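
Assumptions A and B can be written out as two small functions. This is only a sketch in Python: the dictionary representation of the connections (element name mapped to 0 for $R_0$ or 1 for $R_1$) and the function names are our own choices for illustration.

```python
from fractions import Fraction

def prob_R1(connections, sampled):
    """Assumption A: probability of response R1, given the sampled elements."""
    if not sampled:
        # No element sampled: the two responses are equally likely.
        return Fraction(1, 2)
    connected_to_R1 = sum(connections[e] for e in sampled)
    return Fraction(connected_to_R1, len(sampled))

def reinforce(connections, sampled, action):
    """Assumption B: after action A0 (0) or A1 (1), every sampled element
    ends up connected to the matching response; unsampled ones are unchanged."""
    updated = dict(connections)
    for e in sampled:
        updated[e] = action
    return updated

start = {"r": 1, "s": 0}   # r connected to R1, s connected to R0

p_both = prob_R1(start, ["r", "s"])        # one of the two sampled elements is on R1
p_r_only = prob_R1(start, ["r"])           # only r sampled
after_A0 = reinforce(start, ["r"], 0)      # r is switched to R0
unchanged = reinforce(start, ["s"], 0)     # s was already on R0
```

Elements that are not sampled keep their connections, which is why only the sampled set matters in a single experiment.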

By assumptions A and B it is clear that the future choices of the 
subject are going to depend upon the choice of the experimenter. 
Therefore we must describe the method that the experimenter uses
to determine his $A$'s. Typical schemes that have been used in actual
experiments are the following:

(i) Choose $A_0$ with probability $p$, independent of the choice of
the subject.

(ii) Make the same choice as the subject made (i.e., choose $A_0$
if he chose $R_0$, $A_1$ if he chose $R_1$).

(iii) Choose $A_0$ if the response of the subject on the previous
experiment was $R_0$. Choose $A_0$ and $A_1$ with equal probabilities
if his response was $R_1$.

We can describe a general class of schemes of the above kind as
follows: We assume that the experimenter chooses $A_1$ with probability
$a$, if the subject made response $R_0$ on the previous experiment;
and chooses $A_0$ with probability $b$, if the subject made response $R_1$
on the last experiment. We can represent the choices of the experimenter
for each choice of the subject by the matrix

$$
\begin{array}{c|cc}
 & A_0 & A_1 \\ \hline
R_0 & 1 - a & a \\
R_1 & b & 1 - b
\end{array}
$$

Thus in the above examples, (i) is the case $1 - a = b = p$, (ii) is
the case $a = 0$, $b = 0$, and (iii) is the case $a = 0$, $b = \frac{1}{2}$. We shall
consider throughout this section and the next the case of two stimulus
elements (called $r$ and $s$). The analysis for a larger number of elements
is similar but more complicated. Many of the results do not depend
upon the number of stimulus elements assumed.

Figure 8. Tree for a single experiment starting from $\{r^1, s^0\}$. The
stages, from left to right, are: connections at beginning, subject
samples, subject chooses, experimenter chooses, connections changed,
probability of branch. The branch probabilities, from top to bottom, are
$\frac{1}{2}\theta^2(1-a)$, $\frac{1}{2}\theta^2 a$, $\frac{1}{2}\theta^2 b$, $\frac{1}{2}\theta^2(1-b)$,
$\theta(1-\theta)b$, $\theta(1-\theta)(1-b)$, $\theta(1-\theta)(1-a)$, $\theta(1-\theta)a$,
and $(1-\theta)^2$ (no element sampled, so no change can occur).

In Figure 8 we have indicated by a tree the various stages in a
single experiment. We label the elements by superscripts 0 or 1 to
indicate that an element is connected to $R_0$ or $R_1$. Let us assume that
initially $r$ is conditioned to $R_1$, and $s$ to $R_0$; that is, we write $\{r^1, s^0\}$
as our starting point.

If we consider a sequence of experiments, we can obtain a Markov
chain as follows. We take as states the number of elements connected
to $R_1$ at any given time. Thus we can label the states 0, 1, and 2.
Since all our probabilities depend only on the number of elements
sampled, $\{r^1, s^0\}$ and $\{r^0, s^1\}$ may be thought of as the same state.
This justifies us in taking the number of the elements conditioned to
$R_1$ as representing our state. The transition probabilities are then
found as follows. To find $p_{1,0}$ we look on our tree for all paths leading
to connection $\{r^0, s^0\}$, and add their probabilities. Doing this, we
obtain

$$
p_{1,0} = \tfrac{1}{2}\theta^2(1 - a) + \tfrac{1}{2}\theta^2 b + \theta(1 - \theta)b.
$$


We can also find from the tree $p_{1,1}$ and $p_{1,2}$. To find the other
transition probabilities we would have to construct a similar tree,
assuming that we started with a case where no elements were connected
to $R_1$ and also a case where both elements were so connected.
(See Exercise 1.) When this is done we obtain the complete matrix
of transition probabilities.

$$
P = \begin{array}{c|ccc}
 & 0 & 1 & 2 \\ \hline
0 & (1-\theta)^2 a + (1 - a) & 2\theta(1-\theta)a & \theta^2 a \\[4pt]
1 & \tfrac{1}{2}\theta^2(1-a) + \tfrac{1}{2}\theta(2-\theta)b
  & (1-\theta)^2 + \theta(1-\theta)(1-a) + \theta(1-\theta)(1-b)
  & \tfrac{1}{2}\theta^2(1-b) + \tfrac{1}{2}\theta(2-\theta)a \\[4pt]
2 & \theta^2 b & 2\theta(1-\theta)b & (1-\theta)^2 b + (1 - b)
\end{array}
$$

In the next section we shall study this Markov chain in more detail.


EXERCISES

1. Construct a tree to show the possibilities for the connections after an
experiment if the two stimulus elements are both connected to $R_1$ at the
beginning of the experiment. Do the same for the case of no elements
connected to $R_1$ at the beginning of the experiment.

2. Using the trees in Exercise 1, verify that the transition probabilities
$p_{0,j}$ and $p_{2,j}$ given above are correct.

3. What is the probability that the subject will make response $R_1$ if at
the beginning of the experiment one element is connected to each response?
What is this probability if at the beginning of the experiment both elements
are connected to response $R_1$? [Ans. $\frac{1}{2}$; $\frac{1}{2} + \theta - \frac{1}{2}\theta^2$.]


In the following exercises, find the matrix of transition probabilities under 
the special assumptions given in the problem. State whether the resulting 
Markov chain is absorbing or regular. Give an interpretation for each of the 
special cases in terms of the actual experiment. If the process is regular, find 
the limiting probabilities. If the process is absorbing, find the expected 
number of steps before absorption for each possible starting state. (See 
Section 4, Exercise 9.) 

4. $a = 1$, $b = 1$, $\theta = \frac{1}{2}$. [Ans. Regular; (.3, .4, .3).]

5. $a = 1$, $b = 0$. [Ans. Absorbing; $t_0 = (3 - 2\theta)/(2\theta - \theta^2)$; $t_1 = 1/\theta$.]

6 . a — b = J, 0 = . 1 . 

7. a = 0, b = 0 = J. 

8. a = 1, b = f, 0 = 

9. $a = 0$, $b = 0$.

10. $\theta = 0$.


11. Consider the case $\theta = 1$, $0 < a < 1$, and $0 < b < 1$. Show that the
matrix of transition probabilities is not regular and not absorbing.

12. In the case of Exercise 11, show that if the process starts in state 0 or
2, it never reaches state 1. Hence this state can be removed. Show that, if
this is done, the resulting two-state Markov chain has a regular transition
matrix. Find the limiting probability. [Ans. $\left(\frac{b}{a+b}, \frac{a}{a+b}\right)$.]

13. In the case of Exercise 11, show that there is a limiting probability of
being in each of the states which is independent of the starting state. Find
the limiting probabilities. Compare your answer with Exercise 12.


6. LIMITING PROBABILITIES IN THE ESTES MODEL

We wish now to study the limiting probabilities that the subject 
and that the experimenter will choose each of the possible alternatives. 
We are primarily interested in the effects of the stimulus elements. 
When the subject does not sample any stimulus element, they have 
no effect. Thus we shall calculate all our probabilities under the as¬ 
sumption that the subject samples at least one element. This amounts 
simply to throwing out all experiments on which the subject does not 
sample any element. Thus in the present section we shall assume that 
this has been done, and in any reference to experiments, it will be 
understood to mean an experiment on which the subject sampled at 
least one element. 

With the above convention, if our process is in state 0 on a given
experiment, then the probability that the subject will make response
$R_0$ is (by assumption A) equal to 1. If it is in state 1, then by symmetry
this probability is $\frac{1}{2}$. If it is in state 2, it is (by assumption A)
equal to 0.

The matrix $P$ will be regular if and only if the quantities $a$, $b$, $\theta$,
and $1 - \theta$ are all not zero (see Exercise 1). If the matrix is regular,
then there will be a limiting probability for being in each of the states.
These probabilities can be represented by a vector $p = (p_0, p_1, p_2)$ and
found by solving the equations

$$pP = p.$$

If these equations are solved, we obtain

$$
p_0 = \frac{b\theta + 2b^2(1 - \theta)}{(a + b)\theta + 2(a + b)^2(1 - \theta)},
$$

$$
p_1 = \frac{4ab(1 - \theta)}{(a + b)\theta + 2(a + b)^2(1 - \theta)},
$$

$$
p_2 = \frac{a\theta + 2a^2(1 - \theta)}{(a + b)\theta + 2(a + b)^2(1 - \theta)}.
$$

From these probabilities we can find that the limiting probability
that the subject will make response $R_0$ is

$$
1 \cdot p_0 + \tfrac{1}{2} \cdot p_1 + 0 \cdot p_2 = \frac{b}{a + b},
$$

and that the limiting probability that the subject makes response $R_1$
is $a/(a + b)$.
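
The limiting vector can be checked numerically against the transition matrix of Section 5. The Python sketch below does so for one set of parameter values; the function names and the choice $\theta = \frac{1}{2}$ are illustrative assumptions, while $a = \frac{1}{2}$, $b = 1$ is simply a convenient regular case.

```python
from fractions import Fraction as F

def estes_matrix(a, b, theta):
    """Transition matrix of the two-element model (states 0, 1, 2)."""
    q = 1 - theta
    half = F(1, 2)
    return [
        [q**2 * a + (1 - a), 2 * theta * q * a, theta**2 * a],
        [half * theta**2 * (1 - a) + half * theta * (2 - theta) * b,
         q**2 + theta * q * (1 - a) + theta * q * (1 - b),
         half * theta**2 * (1 - b) + half * theta * (2 - theta) * a],
        [theta**2 * b, 2 * theta * q * b, q**2 * b + (1 - b)],
    ]

def limiting_vector(a, b, theta):
    """Closed-form solution (p0, p1, p2) of pP = p."""
    denom = (a + b) * theta + 2 * (a + b) ** 2 * (1 - theta)
    return [
        (b * theta + 2 * b**2 * (1 - theta)) / denom,
        4 * a * b * (1 - theta) / denom,
        (a * theta + 2 * a**2 * (1 - theta)) / denom,
    ]

a, b, theta = F(1, 2), F(1), F(1, 2)
P = estes_matrix(a, b, theta)
p = limiting_vector(a, b, theta)

# p should be a probability vector that is left unchanged by P.
pP = [sum(p[i] * P[i][j] for i in range(3)) for j in range(3)]

# Limiting probability of response R0: 1*p0 + (1/2)*p1 + 0*p2.
prob_R0 = p[0] + p[1] / 2
```

For these values `prob_R0` agrees with $b/(a + b)$, and it does not depend on $\theta$.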

To find the probability that the experimenter makes the choice $A_0$,
we must multiply the probabilities for each of the choices of the
subject, by the probabilities that the experimenter does $A_0$ if the
subject made the particular choice. Thus the limiting probability that
the experimenter makes choice $A_0$ is

$$
\frac{b(1 - a)}{a + b} + \frac{ab}{a + b} = \frac{b}{a + b}.
$$

Thus we see that the limiting probability that the subject will make
response $R_0$ is equal to the limiting probability that the experimenter
will choose $A_0$. From these limiting probabilities we can find the
limiting probability that the subject will guess correctly (see Exercise 3).

If we assume that the experimenter makes response $A_0$ with probability
$p$ independent of the choice of the subject, the subject can
maximize the expected number of correct responses by always making
response $R_0$ if $p > \frac{1}{2}$ and always making $R_1$ if $p < \frac{1}{2}$. (See Exercise 5.)
The model predicts a less rational choice on the part of the subject.
This would not seem disturbing in the case of the rat, but it would be
hoped humans would do better. Unfortunately experiments have
borne out that the model's predictions are approximately correct even
with human subjects.

The following interesting experiment was performed by W. K. Estes
and others with many types of subjects. If the subject does $R_0$, he is
rewarded half the time; if he does $R_1$ he is never rewarded. One might
expect that the subject will learn to do $R_0$, but this is not the case.
What does the theory predict? If $R_0$ is chosen, reward follows half
the time. Hence $a = \frac{1}{2}$. If $R_1$ is chosen, reward never follows. Hence
$1 - b = 0$ or $b = 1$. The theory predicts a limiting probability of
$b/(a + b) = \frac{2}{3}$ for the subject to choose $R_0$, which is in good
agreement with experimental results.

We next consider an absorbing case. Specifically, we consider the
case $a = 0$ and $b = 1$. This means that the experimenter always does
$A_0$. The matrix of transition probabilities here is

$$
P = \begin{pmatrix}
1 & 0 & 0 \\
\theta & 1 - \theta & 0 \\
\theta^2 & 2\theta(1 - \theta) & (1 - \theta)^2
\end{pmatrix}.
$$

We shall use the methods developed in Section 4 to study this
Markov chain. We have one absorbing state, namely, 0. Thus we
know that the process will eventually enter this state and remain
there. Being in this state means, by assumption A of the previous
section, that the subject is sure to make response $R_0$. Thus being
absorbed can be interpreted as the subject “learning” that the
experimenter always does $A_0$.

We have seen that in an absorbing Markov chain it is possible to
find the expected number of times that the process will be in each
of the states before being absorbed, assuming some given starting
state. Let $t_{ij}$ be the expected number of times the process will be in
state $j$ if it starts in state $i$. Before calculating $t_{ij}$ we consider what the
knowledge of these quantities would tell us about the experiment.
We observe that every time the process is in state 1, the subject
chooses $R_1$ with probability $\frac{1}{2}$ and hence makes a wrong response with
probability $\frac{1}{2}$. Every time the process is in state 2, the subject is sure
to make response $R_1$, that is, to make a wrong response. Thus the
expected number of wrong responses that the subject will make before
learning is

(1) $\tfrac{1}{2}t_{i1} + t_{i2}$ for $i = 1, 2$,

assuming that the process starts in state $i$.

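We find the $t_{ij}$ as in Section 4. We first form the truncated matrix
$Q$ obtained from $P$ by omitting the column and the row corresponding
to the absorbing state.

$$
Q = \begin{pmatrix}
1 - \theta & 0 \\
2\theta(1 - \theta) & (1 - \theta)^2
\end{pmatrix}.
$$

We then find $(I - Q)^{-1}$ to be

$$
\begin{pmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{pmatrix}
= (I - Q)^{-1}
= \begin{pmatrix}
\dfrac{1}{\theta} & 0 \\[8pt]
\dfrac{2(1 - \theta)}{\theta(2 - \theta)} & \dfrac{1}{\theta(2 - \theta)}
\end{pmatrix}.
$$

Then from (1) we obtain $1/(2\theta)$ as the expected number of wrong
responses if the process begins in state 1, and $1/\theta$ as the expected
number of wrong responses if the process begins in state 2.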

Of course it is true that in an actual experiment the starting state
would not be known. However, it is not unreasonable to assume that
on the first experiment the stimulus elements are connected at random.
This would mean that the process starts at state 0 with probability
$\frac{1}{4}$, at state 1 with probability $\frac{1}{2}$, and at state 2 with probability $\frac{1}{4}$.
Thus under this assumption the expected number of wrong responses
before learning is

(2) $\dfrac{1}{2} \cdot \dfrac{1}{2\theta} + \dfrac{1}{4} \cdot \dfrac{1}{\theta} = \dfrac{1}{2\theta}.$
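
The results (1) and (2) for the absorbing case $a = 0$, $b = 1$ can be verified for a particular $\theta$. The following Python sketch is an illustration only; the value $\theta = \frac{1}{4}$ is an arbitrary choice, and the closed form used for $T$ is the inverse of $I - Q$ for this case.

```python
from fractions import Fraction as F

theta = F(1, 4)   # illustrative sampling probability

# Truncated matrix Q over the nonabsorbing states 1 and 2.
Q = [[1 - theta, F(0)],
     [2 * theta * (1 - theta), (1 - theta) ** 2]]

# Closed form of T = (I - Q)^(-1).
T = [[1 / theta, F(0)],
     [2 * (1 - theta) / (theta * (2 - theta)), 1 / (theta * (2 - theta))]]

# Check that T really is the inverse of I - Q.
I_minus_Q = [[F(int(i == j)) - Q[i][j] for j in range(2)] for i in range(2)]
check = [[sum(T[i][k] * I_minus_Q[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]

# Formula (1): expected wrong responses starting in state 1 and in state 2.
wrong = [T[i][0] / 2 + T[i][1] for i in range(2)]

# Formula (2): random initial connections, i.e. states 0, 1, 2 with
# probabilities 1/4, 1/2, 1/4 (no wrong responses are made from state 0).
wrong_random_start = F(1, 2) * wrong[0] + F(1, 4) * wrong[1]
```

With $\theta = \frac{1}{4}$ the three values are $1/(2\theta) = 2$, $1/\theta = 4$, and $1/(2\theta) = 2$.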

EXERCISES 

1. Prove that the matrix $P$ in Section 5 is regular if and only if $a$, $b$, $\theta$, and
$1 - \theta$ are all different from zero. (Hint: Show that if any one of the quantities
is 0, the chain is not regular.)

2. Verify that the probability that the subject makes response $R_0$ is
$b/(a + b)$ by finding $1 \cdot p_0 + \frac{1}{2} \cdot p_1 + 0 \cdot p_2$.

3. Show that the limiting probability that the subject’s choice agrees with
that of the experimenter is

$$
\frac{a(1 - b) + b(1 - a)}{a + b}.
$$

4. Assume that the experimenter always chooses $A_0$ with a fixed
probability $p$, independent of the choice of the subject. What proportion would
the subject expect to get correct? [Ans. $1 - 2p + 2p^2$.]

5. Suppose under the conditions of Exercise 4 that the subject were
always to make response $R_0$. Show that if $p > \frac{1}{2}$, then on the average the
subject will do better by this method than by the method predicted by the
model.

6. Consider the case $a = \frac{1}{2}$, $b = 0$, and $\theta = \frac{1}{2}$. For each possible starting
state find the expected number of times that the process will be in each of
the states before being absorbed. [Ans. $t_{00} = 3$; $t_{01} = 2$; $t_{10} = \frac{1}{2}$; $t_{11} = 3$.]

7. Do the same as in Exercise 6, for the case $a = 0$, and $b = 0$.

8. In Exercises 6 and 7 find the expected number of incorrect responses 
that the subject will make, assuming each possible starting state. 

[Ans. 4,2,0; 0,0,0.] 

9. In Exercises 6 and 7 find the expected number of incorrect responses 
that the subject will make assuming random connections for the stimuli 
elements on the first experiment, as in (2). 

10. If the subject chooses $R_0$, he is rewarded with probability $p$. If he
chooses $R_1$, he is never rewarded. (See the example with $p = \frac{1}{2}$ in the text
above.) Find $a$ and $b$. What is the limiting probability that the subject
chooses $R_0$? How often is he rewarded? How often would he be rewarded
if he always chose $R_0$? Compare these two values for p = f,

[Ans. $1/(2 - p)$; $p/(2 - p)$; $p$.]

11. Compute $p_0$, $p_1$, $p_2$ for the cases given in Section 5, Exercises 4-9.
For the regular matrices verify that these are the limiting probabilities there
obtained. What do $p_0$, $p_1$, $p_2$ mean for the absorbing chains?

7. MARRIAGE RULES IN PRIMITIVE SOCIETIES 

In some primitive societies we find rigid rules as to when marriages 
are permissible. These rules are designed to prevent very close rela¬ 
tives from marrying. The rules can be given precise mathematical 
formulation in terms of permutation matrices. Our discussion is
based, in part, on the work of André Weil and Robert R. Bush.

The marriage rules we find in these societies are characterized by
the following axioms.

Axiom 1. Each member of the society is assigned a marriage type.

Axiom 2. Two individuals are permitted to marry only if they are of
the same marriage type.

Axiom 3. The type of an individual is determined by the individual's
sex and by the type of his parents.

Axiom 4. Two boys (or two girls) whose parents are of different types
will themselves be of different types.

Axiom 5. The rule as to whether a man is allowed to marry a female
relative of a given kind depends only on the kind of relationship.

Axiom 6. In particular, no man is allowed to marry his sister.

Axiom 7. For any two individuals it is permissible for some of their
descendants to intermarry.

Example. Let us suppose that there are three marriage types, $t_1$,
$t_2$, $t_3$. Two parents in a given family must be of the same type, since
only then are they allowed to marry. Thus there are only three logical
possibilities for marriages. For each case we have to state what the
type of a son or a daughter will be.

    Type of both parents    Type of their son    Type of their daughter
    $t_1$                   $t_2$                $t_3$
    $t_2$                   $t_3$                $t_1$
    $t_3$                   $t_1$                $t_2$


We must verify that all the axioms are satisfied. Some of the axioms 
are easy to check (see Exercise 1), others are harder to verify. We 
will prove a general theorem which will show that this rule satisfies 
all the axioms. 


In order to give a complete treatment to this problem, we must 
have a simple systematic method of representing relationships. For 
this we use family trees, as drawn by anthropologists. The following 
symbols are commonly used:

    triangle                                  Male
    circle                                    Female
    horizontal line joining two symbols       Marriage
    vertical line                             Descendant
    bracket joining two symbols from above    Sibling


In Figure 9 we draw four family trees, representing the four kinds 
of first-cousin relationships between a man and a woman. 



(a) (b) (c) (d) 

Figure 9 


Example (continued). Does our rule allow marriage between a
man and his father’s brother’s daughter? This is the relationship in
Figure 9(a). There are three possible types for the original couple
(the grandparents) and in Figure 10 we work out the three cases. We
find in each case that the man and woman are of different type, hence
such marriages are never allowed. Can a man marry his mother’s
brother’s daughter? This is the relationship in Figure 9(d). The three
cases for this relationship are found in Figure 11. We find that such
marriages are always allowed.

Figure 10

Figure 11

We are now ready to give the rules a mathematical formulation.
The society chooses a number, say $n$, of marriage types (Axiom 1).
We call these $t_1, t_2, \ldots, t_n$. Our rule has two parts, one concerning
sons, one concerning daughters. Let us consider the marriage type
of sons. The parents must be of the same marriage type (Axiom 2).
We must assign to a boy a type which depends only on the common
type of his parents (Axiom 3). If his parents are of type $t_i$, he will
be of type $t_j$. Furthermore, if some other boy has parents of a type
different from $t_i$, then the boy will be of type different from $t_j$ (Axiom
4). This defines a permutation of the marriage types (see Chapter V,
Section 10); the type of a son is obtained from the type of his parents
by a permutation specified by the rule of the society. Hence we form
the type vector $t = (t_1, \ldots, t_n)$ and represent the permutation in
question by the $n \times n$ permutation matrix $S$. If the type of the
parents is component $i$ of $t$, the type of their sons is component $i$ of
$tS$. By a similar argument we arrive at the permutation matrix $D$
giving the type of daughters.
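
To make the passage from a rule to a permutation matrix concrete, here is a small sketch in Python. The helper names are our own, and the rule used is the three-type example of this section as we read it (sons: $t_1 \to t_2$, $t_2 \to t_3$, $t_3 \to t_1$; daughters: $t_1 \to t_3$, $t_2 \to t_1$, $t_3 \to t_2$), which should be treated as an assumption.

```python
def permutation_matrix(child_of, n):
    """Build the n x n matrix M for which component i of tM is the child's type.

    child_of[i] = j means parents of type t_(i+1) have children of type t_(j+1),
    so the 1 in column i is placed in row j.
    """
    m = [[0] * n for _ in range(n)]
    for i, j in enumerate(child_of):
        m[j][i] = 1
    return m

def apply(t, m):
    """Row vector of type labels times a permutation matrix."""
    n = len(t)
    return [next(t[j] for j in range(n) if m[j][i] == 1) for i in range(n)]

t = ["t1", "t2", "t3"]
S = permutation_matrix([1, 2, 0], 3)   # sons
D = permutation_matrix([2, 0, 1], 3)   # daughters

sons = apply(t, S)        # component i is the type of sons of parents of type t[i]
daughters = apply(t, D)   # component i is the type of daughters
```

Axiom 4 is what guarantees that `child_of` is a permutation, so that each column and each row of the matrix contains exactly one 1.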

We have shown that the mathematical form of the first four axioms 
is to introduce the row vector t and the two permutation matrices 
S and D. The last three axioms restrict the choice of S and D . This 
will be considered in the next section. 

We have repeatedly seen how the vector and matrix notation
allows us to replace a series of equations by a single one. In the
present problem this notation allows us to work out a given kind of
relationship for all marriage types in a single diagram. As a matter of fact,
this can be done without knowing how many types there are in the given
society, or knowing what the rules are. Let us illustrate this in terms of
Figure 11. The couple at the top of the tree is of a given type, represented by
our vector $t$. Their son is of type $tS$, their daughter of type $tD$. Then the
son of a son is of type $tSS$, the son’s daughter is of type $tSD$, etc. We arrive
at the single vector diagram of Figure 12. If in this figure we take $t$ to have
three components, then the diagram is a shorthand for the three
diagrams of Figure 11.

Figure 12



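Example (continued). Our $t$ vector is $(t_1, t_2, t_3)$ and

$$
D = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix},
\qquad
S = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.
$$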


We know from Figure 11 that a man is always allowed to marry his
mother’s brother’s daughter. Can we see this in Figure 12? The marriage
will always be permitted if $tDS$ always equals $tSD$, which is
equivalent to the matrix equation $DS = SD$. It so happens for our
$S$ and $D$ that this equation is correct. But we can see more from
Figure 12. No matter how many types there are, this kind of marriage
will be permitted if and only if $SD = DS$, i.e., if the two matrices
commute.

We have now seen one example of how the nature of S and D deter¬ 
mines which kinds of relatives are allowed to marry. This question 
will be the subject of the next section. 
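
The two cousin relationships discussed in the example can be tested directly on the matrices. The sketch below is an illustration in Python; the helper names are ours, and the matrices $S$ and $D$ are those of the three-type example as we read them (an assumption). A relationship is always allowed when the two type matrices are equal, and never allowed when they disagree in every column.

```python
def matmul(a, b):
    """Product of two square 0-1 matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def always_allowed(m1, m2):
    """t*m1 equals t*m2 in every component."""
    return m1 == m2

def never_allowed(m1, m2):
    """t*m1 and t*m2 differ in every component (no 1 in a common position)."""
    n = len(m1)
    return all(m1[i][j] * m2[i][j] == 0 for i in range(n) for j in range(n))

S = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]   # sons
D = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # daughters

# Mother's brother's daughter: the man has type tDS, the woman tSD.
mbd_always = always_allowed(matmul(D, S), matmul(S, D))

# Father's brother's daughter: the man has type tSS, the woman tSD.
fbd_never = never_allowed(matmul(S, S), matmul(S, D))

# Brother and sister: types tS and tD.
siblings_never = never_allowed(S, D)
```

The same three checks can be run for any other pair of permutation matrices, whatever the number of types.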

EXERCISES 

1. In the example above, verify that the rule satisfies Axioms 1, 3, and 4. 

2. In the example above, verify that the matrices S and D given represent 
the rule given. 

3. Construct a diagram for the brother-sister relationship. 

4. Using the diagram of Exercise 3, show that, in the above example, 
brother-sister marriages are never permitted. 

5. Find the condition on $S$ and $D$ that would always allow brother-sister
marriages. [Ans. $S = D$.]

In the Kariera society there are four marriage types, assigned according
to the following rules:

    Parent type    Son type    Daughter type
    $t_1$          $t_3$       $t_4$
    $t_2$          $t_4$       $t_3$
    $t_3$          $t_1$       $t_2$
    $t_4$          $t_2$       $t_1$

Exercises 6-11 refer to this society.



6. Find the t, S, and D of the Kariera society. 

7. Show that brother-sister marriages are never allowed in the Kariera 
society. 

8. Show that S and D commute. What does this tell us about first-cousin 
marriages in the Kariera society? 

9. Show that first cousins of the kinds in Figure 9(a) and (b) are never
allowed to marry in the Kariera society.

10. Show that first cousins of the kind in Figure 9(c) are always allowed
to marry in the Kariera society.

11. Find the group generated by S and D of the Kariera society. (See 
Chapter V, Section 11.) 

In the Tarau society there are also four marriage types. A son is of the
same type as his parents. A daughter's type is given by:

    Parent type      t_1   t_2   t_3   t_4
    Daughter type    t_4   t_1   t_2   t_3

Exercises 12-17 refer to this society.

12. Find the t, S, and D of the Tarau society.

13. Show that brother-sister marriages are never allowed in the Tarau
society.

14. Show that S and D commute. What does this tell us about first-cousin
marriages in the Tarau society?

15. Show that first cousins of the kinds in Figure 9(a) and (b) are never
allowed to marry in the Tarau society.

16. Show that first cousins of the kind in Figure 9(c) are never allowed to
marry in the Tarau society.

17. Find the group generated by S and D of the Tarau society. (See
Chapter V, Section 11.)

8. THE CHOICE OF MARRIAGE RULES

In the last section we saw that the marriage rules of a primitive
society are determined by the vector t and the matrices S and D.
The axioms make no mention of the number of types, and indeed, we
will find that we can have any number of types, as long as n > 1.
But we will find that the choice of S and D is severely limited. This
shows that the rules of existing primitive societies required considerable
ingenuity for their construction.

We must now consider the last three axioms. For Axiom 5 we need
a simple way of describing a kind of relationship. The family tree is
our basic tool, but we want to replace the family tree by a suitable
matrix.

Let us consider Figure 12. Instead of starting with the grandparents
and finding the types of the grandson and the granddaughter,
we could start with the grandson, work up to the grandparents, and
then down to the granddaughter. For this we must consider how we
work “up.” If a parent is of type t, the son is of type tS. Hence,
if the son is of type t, then the parent is of type tS^{-1} (see Chapter V,
Section 10). Similarly, if a daughter has type t, her parents have type
tD^{-1}. In Figure 13 we find the new version of Figure 12.

It is easily seen that we can follow this procedure for any relationship.
Given a kind of relationship, it determines a matrix M such that
if the male of the relationship is of type t, then the female is of type
tM. From Figure 13 we see that for “mother’s brother’s daughter”
M = S^{-1}D^{-1}SD.

We will speak of M as the matrix of the relationship. These matrices
are all products of S, D, and their inverses, hence each matrix is an
element of the group generated by S and D.
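
The matrix of a relationship can be computed mechanically. The following is only a sketch under our own assumptions: types are numbered 0 to 3 instead of t_1 to t_4, each permutation is stored as a tuple of images rather than as a matrix, and the names compose and inverse are hypothetical helpers. The Kariera rules are read off the table in the exercises of the last section.

```python
# A permutation is stored as a tuple p, where p[i] is the type that type i is carried to.

def compose(*perms):
    """Apply the permutations left to right, as in the product S^-1 D^-1 S D."""
    result = tuple(range(len(perms[0])))
    for p in perms:
        result = tuple(p[i] for i in result)
    return result

def inverse(p):
    """Return the permutation that undoes p (used to work "up" the family tree)."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Kariera rules (assumed encoding of the table, with t_1..t_4 written as 0..3).
S = (2, 3, 0, 1)   # parent type -> son type
D = (3, 2, 1, 0)   # parent type -> daughter type

# Mother's brother's daughter: up to the man's parents, up to his mother's
# parents, down to her brother, down to the brother's daughter.
M = compose(inverse(S), inverse(D), S, D)
print(M)
```

For this society the product leaves every type where it was, so the man and his mother's brother's daughter always have the same type.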

Let us consider Axiom 5. Given any kind of relationship between 
a man and a woman, we form the matrix of the relationship M. The 
man will be permitted to marry this relation of his if and only if his 
type is the same as hers, i.e., if a certain component of t is the same 
as the corresponding component of tM. This means that this com¬ 
ponent is left unchanged by the permutation M , which proves our 
first theorem. (See Chapter V, Section 11.) 

Theorem 1 . A man is allowed to marry a female relative of a 
certain kind if and only if his marriage type does not belong to the 
effective set of the matrix of the relationship. 

A second result follows from this theorem easily. 

Theorem 2. Marriage between relatives of a given kind is always 
permitted if the matrix of the relationship has an empty effective set; 
it is never permitted if the matrix has a universal effective set. 
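
Theorems 1 and 2 translate directly into a small test. In this sketch we assume that the effective set of a permutation is the set of objects it moves, which is how Theorem 1 uses the term; the tuple encoding of permutations and the function names are our own choices for illustration.

```python
def effective_set(p):
    """Types that the permutation actually moves (p[i] is the image of type i)."""
    return {i for i, image in enumerate(p) if image != i}

def marriage_status(p):
    """Classify a relationship by the effective set of its matrix."""
    moved = effective_set(p)
    if not moved:
        return "always permitted"
    if len(moved) == len(p):
        return "never permitted"
    fixed = sorted(set(range(len(p))) - moved)
    return "permitted only for types " + str(fixed)

print(marriage_status((0, 1, 2, 3)))   # the identity
print(marriage_status((1, 0, 3, 2)))   # a complete permutation
print(marriage_status((1, 0, 2, 3)))   # moves only the first two types
```

The third case is the kind of relationship that Axiom 5 rules out: the marriage would be permitted for some types and forbidden for others.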
Theorem 3. Axiom 5 requires that in the group generated by S
and D every element except I is a complete permutation.

Proof. The axiom states that for a given relationship the marriage
must always be allowed or must never be allowed. Hence, by Theorem 2,
the matrix of the relationship must have an empty effective
set or a universal one. The former means that the matrix is I, the
latter that it is a complete permutation (see Chapter V, Section 11).
Hence the matrix of every relationship must either be I or a complete
permutation matrix. The matrices are elements of the group generated
by S and D. And given any element of this group, which can
be written as a product of S's and D's, we can draw a family tree
having this matrix. Hence the matrices of relationships are all the
elements of the group. This means that all the elements of the group,
other than the identity, must be complete permutations. This completes
the proof.

Theorem 4. Axiom 6 requires that S^{-1}D be a complete permutation.

This theorem is an immediate consequence of the fact that the
matrix of the brother-sister relationship is S^{-1}D.

Theorem 5. Axiom 7 requires that for every i and j there be a
permutation in the group which carries t_i into t_j.

Proof. Let us choose two individuals, one of type t_i and one of
type t_j. There must be a descendant of the former who can marry
a descendant of the latter. Hence the two descendants must have the
same type. This means that we have permutations M_1 and M_2 such
that t_i is carried by M_1 into the same type as t_j by M_2. Then
M_1 M_2^{-1} carries t_i into t_j. Hence the theorem follows.

We have now translated Axioms 5-7 into the following three conditions
on S and D: (1) The group generated by S and D consists
of I and of complete permutations. (2) S^{-1}D is a complete permutation.
(3) For every pair of types there is a permutation in the group
that carries one type into the other.

Definition. A permutation group is called regular if (a) it is complete,
i.e., every element of the group other than I is a complete
permutation and if (b) for every pair from among the n objects
there is a permutation in the group that carries one into the other.

Basic theorem. To satisfy the axioms we must choose two different
n × n permutation matrices S and D which generate a regular
permutation group.

Proof. Conditions (1) and (3) above state precisely that the group
generated by S and D be regular. In a regular group every element
other than I is a complete permutation; hence condition (2) requires
only that S^{-1}D be different from I. Since S^{-1}D = I is equivalent to
D = S, we need only require that D ≠ S. This completes the proof.

It is important to be able to recognize regular permutation groups. 
Here we are helped by a very simple, well-known theorem: A sub¬ 
group of the group of permutations of degree n is regular if and only 
if it has n elements and is complete. 
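
The criterion just quoted is easy to test by machine. The sketch below is an illustration under our own assumptions: permutations are stored as tuples of images, the group is generated by repeatedly multiplying by the generators (which suffices for a finite group), and the function names are hypothetical.

```python
def compose(p, q):
    """Apply p first, then q."""
    return tuple(q[i] for i in p)

def generate_group(generators):
    """Collect every product of the generators; closure gives the whole finite group."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            new = compose(g, h)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group

def is_complete(p):
    """A complete permutation leaves no object in place."""
    return all(image != i for i, image in enumerate(p))

def is_regular(group):
    """Regular: exactly n elements, and every element except I is complete."""
    n = len(next(iter(group)))
    identity = tuple(range(n))
    return len(group) == n and all(is_complete(g) for g in group if g != identity)

cyclic = generate_group([(1, 2, 3, 0)])
not_regular = generate_group([(1, 0, 2, 3), (0, 1, 3, 2)])
print(len(cyclic), is_regular(cyclic))
print(len(not_regular), is_regular(not_regular))
```

Both groups have four elements, but only the cyclic one is regular; the second contains permutations that leave two objects fixed.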

This leads to a relatively simple procedure. We choose n. Then we
must pick a group of n × n permutation matrices which has n elements
and is complete, and select two different elements which generate
the group. This is always possible if n > 1 (see Exercise 11). One
of these is chosen as S and one as D. Since there are not very many
regular permutation groups for any n, the choice is very limited.

Example. Let us find all possibilities for a society having four 
marriage types. First of all we must find the regular subgroups of the 
symmetric group of degree 4, i.e., the groups of permutations on four 
objects that have four elements and are complete. 

Among these we find cyclic groups. Any two of these groups have
the same structure and hence lead to equivalent rules. Let us suppose
that we choose the permutation group generated by

        ( 0 1 0 0 )
    P = ( 0 0 1 0 )
        ( 0 0 0 1 )
        ( 1 0 0 0 )

The group consists of P, P^2, P^3, and I. Either P or P^3 generates the
group, and they play analogous roles. We may therefore assume that
P is one of the two permutations chosen. This allows us (P,P^2),
(P,P^3), and (P,I) as possibilities. We must still ask which is S and
which is D. In the second case it makes no difference, since P and P^3
play analogous roles in the group, but there is a difference in the other
two cases. This leads to five possibilities:

1. S = P, D = P^2
2. S = P^2, D = P
3. S = P, D = P^3
4. S = P, D = I
5. S = I, D = P. This is the Tarau society.


There is only one noncyclic complete subgroup with four elements, 
consisting of I and the three permutations which interchange two 
pairs of elements. In this group we have essentially only one case, 
since all three permutations play the same role. 


6. The Kariera society. (See exercises after the last section.) 

Two of these six possibilities are actually exemplified in known 
primitive societies. 
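
The six rules can be compared by computing the matrices of the two cross-cousin relationships for each of them. This is a sketch under stated assumptions: permutations are tuples of images with types numbered 0 to 3, the cyclic shift (1, 2, 3, 0) stands in for the matrix P, the Kariera rules are taken from the table in the exercises of the last section, and the helper names are our own.

```python
def compose(*perms):
    """Apply the permutations left to right."""
    result = tuple(range(len(perms[0])))
    for p in perms:
        result = tuple(p[i] for i in result)
    return result

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

I = (0, 1, 2, 3)
P = (1, 2, 3, 0)          # cyclic shift, standing in for the matrix P
P2 = compose(P, P)
P3 = compose(P2, P)

# Rule number -> (S, D), in the order of the list in the example.
rules = {
    1: (P, P2), 2: (P2, P), 3: (P, P3), 4: (P, I), 5: (I, P),
    6: ((2, 3, 0, 1), (3, 2, 1, 0)),   # Kariera
}

mothers_brothers_daughter = []
fathers_sisters_daughter = []
for number, (S, D) in rules.items():
    # Marriage is permitted exactly when the matrix of the relationship is I.
    if compose(inverse(S), inverse(D), S, D) == I:
        mothers_brothers_daughter.append(number)
    if compose(inverse(S), inverse(S), D, D) == I:
        fathers_sisters_daughter.append(number)

print(mothers_brothers_daughter)
print(fathers_sisters_daughter)
```

The two lists agree with the answers stated in Exercises 4 and 5 below.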


EXERCISES 

1. Figure 13 shows the matrix of one of the first-cousin relations. Find 
the matrices of the other three first-cousin relationships. 

2. Prove that marriage between relations of a certain kind is permitted 
if and only if the matrix of the relation is I. 

3. Use the result of Exercise 2 to prove that no society allows the marriage 
between cousins of the types in Figure 9(a) and (b). 

4. Which of the six rules described above (in the example) allow marriage
between a man and his father's sister's daughter? [Ans. 3, 6.]

5. Show that all six rules given in the example above allow marriages 
between a man and his mother's brother's daughter. 

6. There are eight kinds of second-cousin relationships between a man 
and a woman. Draw their family trees. 

7. Find the matrices of the eight second-cousin relationships. 

8. Are there any second-cousin relationships for which marriage is for¬ 
bidden by all possible rules? [Ans. Yes.] 

9. Test the second-cousin relationships (other than those found in Exer¬ 
cise 8) for each of the six rules given in the example above. 

10. For n objects, consider the permutation that carries object number i 
into position i + 1, except that the last object is put into first place. Show 
that the cyclic group generated by this permutation is regular. 





11. Use the result of Exercise 10 to show that a society can have any 
number of marriage types, as long as the number is greater than one. 

12. In the Example of Section 7, prove that S and D generate a regular 
permutation group. 

13. Prove that the following matrices lead to a rule satisfying all axioms.

        ( 0 1 0 0 0 0 )         ( 0 0 0 1 0 0 )
        ( 0 0 1 0 0 0 )         ( 0 0 0 0 0 1 )
    S = ( 1 0 0 0 0 0 ),    D = ( 0 0 0 0 1 0 )
        ( 0 0 0 0 1 0 )         ( 1 0 0 0 0 0 )
        ( 0 0 0 0 0 1 )         ( 0 0 1 0 0 0 )
        ( 0 0 0 1 0 0 )         ( 0 1 0 0 0 0 )


14. Prove that the rule given in Exercise 13 allows no first-cousin marriages. 


9. MODEL OF AN EXPANDING ECONOMY 

The following model is a modification of a model proposed by John 
von Neumann. It is designed to study an economy which is expanding 
at a fixed rate, but which is otherwise in equilibrium. The model 
makes certain assumptions about how an economy behaves in equi¬ 
librium. These assumptions are idealizations, and it is to be expected 
that the model will eventually be replaced by a better model. For 
the present many economists consider the von Neumann model to be 
a reasonable approximation of reality. Our interest in the model is 
purely to illustrate how finite mathematics is used in an economic 
problem. 

The economy is described by n goods and m processes. A good may 
be steel, coal, houses, shoes, etc. Goods are the materials of produc¬ 
tion in the economy. Each good may be measured in any convenient 
units, as long as the units are fixed once and for all. It is convenient 
to be able to talk of arbitrary multiples of these units; e.g., we will 
consider not only 2.75 tons of steel but also 2.75 houses. The latter 
may be interpreted as an average. 

A manufacturing process needs certain goods as raw materials (the 
inputs) and produces one or more of our goods (the outputs). As a 
process we may, for example, consider the conversion of steel, wood, 
glass, etc. into a house. Of course this process may be used to manu¬ 
facture more than one house, and hence we have the concept of the 
intensity with which a process is used. One of the basic assumptions 
is one of linearity, i.e., that k houses will require k times as much of
each raw material. Thus we choose an arbitrary “unit intensity” for
each process, and the process is completely described if we know the
inputs necessary for this unit operation and the outputs produced.

Process number i when operating at unit intensity will require a
certain amount of good j as an input. This amount will be called a_ij.
(In particular, if good j is not needed for process i, then a_ij = 0.)
We will call b_ij the amount of good j produced by process i. Here we
allow a process to produce several different goods (e.g., a principal
output and by-products). But, of course, we allow processes that
produce only one good. Then all the b_ij for this i will be 0, except for
one. The a_ij and b_ij are nonnegative numbers.

We define the matrix A to be the m × n matrix having components
a_ij, and B to be the m × n matrix with components b_ij. Then the
entire economy is described by these two matrices.

We must still consider the element of time. It is customary to
think of the economy as working in stages or cycles. In one such
stage there is just time enough for process i to convert the inputs a_ij
to outputs b_ij. Then in the next stage, these outputs may in turn be
used as inputs. The length of this cycle may be any time interval
convenient for the study of the particular economy. It may be a
month, a year, or a number of years.

Example. Let us take as our economy a chicken farm. Our goods 
are chickens and eggs, with one chicken and one egg being the natural 
units. Our two processes consist of laying eggs and hatching them. 
Let us assume that in a given month a chicken lays an average of 
12 eggs if we use it for laying eggs. If used for hatching, it will hatch 
an average of four eggs per month. From this information we can 
construct A and B . 

Our cycle is of length one month. Good 1 is “chicken,” good 2 is 
“egg,” process 1 is “laying,” and process 2 is “hatching.” The unit 
of intensity of a process will be what one chicken can do on the 
average in a month. The input of process 1 is one chicken, i.e., one 
unit of good 1. The output will consist of a dozen eggs plus the origi¬ 
nal chicken. (We must not forget this, since the original chicken can 
be used again in the next cycle.) Hence the output is one unit of 
good 1 and 12 units of good 2. In process 2 the inputs are one chicken
and four eggs, while the output consists of five chickens (the original
one plus the four hatched). Hence our matrices are

                          Chicken  Egg                Chicken  Egg
    Laying eggs:      A = (  1      0  )         B = (  1      12  )
    Hatching eggs:        (  1      4  )             (  5       0  )

Suppose that our farmer starts with three chickens and eight eggs 
ready for hatching. He will need two chickens for hatching the eight 
eggs, and this leaves him one for laying eggs. Hence he uses process 1 
with intensity 1, process 2 with intensity 2. We symbolize this by the 
vector x = (1,2). Note that his inputs are the components of xA. 
His one laying chicken will lay 12 eggs. He will end up with his origi¬ 
nal three chickens plus eight new ones. Hence he will have an output 
of 11 units of good 1 and 12 units of good 2. These are the components 
of xB. Of his 11 chickens only three can be used for hatching, hence 
he will employ intensities (8,3). The outputs will be (8,3)B = (23,96),
as can easily be checked (see Exercise 1). He now has 96 eggs and 
only 23 chickens, so that some eggs must go unhatched. 

On the other hand, suppose that he starts with only two chickens 
and four eggs. He will then use intensity (1,1). His laying chicken 
lays 12 eggs, and with four newly hatched chickens he has a total of 
six chickens. This result is also given by (1,1)B = (6,12). He now
has tripled both his chickens and his eggs. He can use intensity (3,3)
on the next cycle, yielding (3,3)B = (18,36), which again triples both
the chickens and the eggs. Thus he can continue to use the same proportion
of the processes, and will continue to triple his output on every
cycle. This economy operates in equilibrium.
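
The bookkeeping in this example can be reproduced with a few lines of code. The sketch below assumes the matrices A and B of the chicken farm as described above; the helper name row_times_matrix is our own.

```python
# Rows are the processes (laying, hatching); columns are the goods (chicken, egg).
A = [[1, 0],
     [1, 4]]    # inputs at unit intensity
B = [[1, 12],
     [5, 0]]    # outputs at unit intensity

def row_times_matrix(x, M):
    """Product of the row vector x with the matrix M."""
    return tuple(sum(x[i] * M[i][j] for i in range(len(x)))
                 for j in range(len(M[0])))

# Three chickens and eight eggs: one chicken lays, two hatch.
print(row_times_matrix((1, 2), A))   # inputs used
print(row_times_matrix((1, 2), B))   # outputs produced
print(row_times_matrix((8, 3), B))   # outputs of the following cycle

# Two chickens and four eggs: intensity (1,1), then (3,3).
print(row_times_matrix((1, 1), A), row_times_matrix((1, 1), B))
print(row_times_matrix((3, 3), A), row_times_matrix((3, 3), B))
```

In the second run the outputs of one cycle are exactly the inputs of the next, which is what makes the proportion (1,1) sustainable.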

As was seen in the example, the natural way to represent the intensities
of our processes is by means of a row vector. Let x_i be the
intensity with which process number i is operated; then the intensity
vector x is (x_1, ..., x_m). Matrix multiplication is then an easy way
of finding the total amount of each good needed, and the totals produced.
Component j of xA is the sum x_1 a_1j + ... + x_m a_mj, where
x_1 a_1j is the amount of good j we are using in process 1, x_2 a_2j the amount
we use in process 2, etc. Hence the jth component of xA is the total
amount of good j needed in the inputs. Similarly, xB gives the total
amounts of the various goods in the outputs.
We must now introduce prices for the various goods. Let y_j be
the price of a unit of good j; this must be nonnegative, but it may be
zero. (The latter represents a good that is so cheap as to be “practically
free.”) It is assumed that k units of good j will cost ky_j. The
price vector y is the column vector

        ( y_1 )
    y = ( ... )
        ( y_n )

Let us consider the products Ay and By. In Ay the ith element is
a_i1 y_1 + ... + a_in y_n; the product a_i1 y_1 is the amount of good 1
needed for unit operation of process i multiplied by the per unit price
of good 1, hence this is the cost of good 1 used in the process, a_i2 y_2
is the cost of good 2 used, etc.
Hence the ith component of Ay is the total cost of inputs for a unit
intensity operation of process i. Similarly, By gives the cost (value)
of the outputs.

Finally, we consider the products xAy and xBy. Since x is 1 × m,
the matrices m × n, and y is n × 1, each product is 1 × 1, that is, a
number. An analysis similar to those above shows that xAy is the
total cost of inputs if the economy is operated at intensity x, with
prices y, and xBy is the total value of all goods produced. (See Exercise 2.)


Example (continued). Suppose that a chicken costs 10 monetary
units, while an egg costs 1 unit; then

    y = ( 10 )
        (  1 )

Here

    Ay = ( 10 )   and   By = ( 22 )
         ( 14 )              ( 50 )

This means that process 1, laying eggs, multiplies our investment by
a factor of 2.2; while process 2, hatching, brings in over $3.50 for
every $1.00 invested. There will be pressure to use the hens just for
hatching, which will create a shortage of eggs, bringing about a
drastic change in prices. Suppose now that a chicken costs only six
times as much as an egg, i.e.,

    y = ( 6 )
        ( 1 )

Then

    Ay = (  6 )   and   By = ( 18 )
         ( 10 )              ( 30 )

In this case each process triples our investment, and there will be no
undue monetary pressure. Hence the farmer can set up his processes
so as to be in equilibrium, and the price structure will be stable.
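
The return of each process under a given price vector can be checked in the same way. This sketch assumes the chicken-farm matrices A and B from the example; exact fractions are used so that the ratios are not blurred by rounding, and the helper name is our own.

```python
from fractions import Fraction

A = [[1, 0], [1, 4]]     # inputs:  rows = (laying, hatching), columns = (chicken, egg)
B = [[1, 12], [5, 0]]    # outputs

def matrix_times_column(M, y):
    """Product of the matrix M with the column vector y."""
    return tuple(sum(M[i][j] * y[j] for j in range(len(y)))
                 for i in range(len(M)))

def returns(y):
    """Value of outputs divided by cost of inputs, for each process."""
    costs = matrix_times_column(A, y)
    values = matrix_times_column(B, y)
    return [Fraction(v, c) for v, c in zip(values, costs)]

print(matrix_times_column(A, (10, 1)), matrix_times_column(B, (10, 1)))
print(returns((10, 1)))   # unequal returns: pressure toward hatching
print(returns((6, 1)))    # equal returns: no pressure either way
```

With prices (10, 1) the returns are 11/5 and 25/7; with prices (6, 1) both are 3.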

The remaining factor to be considered is the expansion of the economy.
We assume that everything expands at a constant rate, i.e.,
that there is a fixed expansion factor α such that if the processes operate
at intensity x in this cycle, they operate at intensity αx during
the next cycle, α^2 x after that, etc. There is also something similar
to expansion for the money of the economy, namely, that through
bearing interest, y units of money in this cycle will be worth βy units
after the cycle. We again assume that the interest factor β is fixed once
and for all in equilibrium. Usually these factors will be greater than
1, but this does not have to be the case. Thus α = 1 represents a
stationary economy, and α < 1 represents a contracting economy.

This completes the survey of the basic concepts. We must now lay 
down our assumptions concerning the behavior of an economy which 
is in equilibrium. These assumptions serve as axioms for the system. 

First of all, we must assure that we produce enough of each good
in each cycle to furnish the inputs of the next cycle. If in a given
cycle the economy functions at intensity x, it will function at αx next
time. The outputs this time will be xB, while the inputs next time
will be αxA; hence we must require

Axiom 1. xB ≥ αxA.

(When we write a vector inequality, we mean that the inequality
holds for every component.) We will of course have to require similar
conditions for the future. For example, in the second cycle the outputs
are αxB, and the inputs needed for the third cycle are α^2 xA.
But when we write the condition that the former be at least as great as
the latter, an α cancels, and we have again the same condition as in
Axiom 1. Hence this axiom serves for all cycles.

The first condition assures that it is possible for the economy to
expand at the constant rate α. We must also assure that the economy
is financially in equilibrium. Suppose that the output of some process
was worth more than β times the input. Then we would be prepared
to pay interest at a larger rate to someone willing to invest in our
process. Hence β would increase. Thus, in equilibrium this must not
be possible; no process can produce profits at a rate greater than that
given by investment. If we operate processes at a unit intensity, then
Ay gives the costs of inputs, while By gives the cost of outputs. The
latter cannot exceed the former by more than a factor β for any
process.

Axiom 2. By ≤ βAy.

The next assumption concerns surplus production. If we produce
more of a given good than can be used by the total economy, the
price drops sharply as merchants try to get rid of their produce.
It is customary to assume, for the sake of simplicity, that such
goods are free, i.e., to give them price zero. The vector difference
xB - αxA = x(B - αA) gives the amounts of overproduction, i.e.,
the jth component is positive if and only if good j is overproduced.
If we assign price zero to these goods, then in the product of the above
vector with y every nonzero factor of the former is multiplied by
zero; hence the product of the two vectors will be 0.

Axiom 3. x(B - αA)y = 0.

Now we turn to the question of whether a given process is worth
undertaking. From Axiom 2 we know that no process can yield more
profit than investment can. But if it yields any less, it is better not
to use it, but rather to invest our money. Hence in Axiom 2 we form
the difference By - βAy; if the ith component of this is negative,
process i should not be used; it must be assigned intensity 0. Similar
to the argument used for Axiom 3, this shows that multiplying this
vector difference by x must yield zero.

Axiom 4. x(B - βA)y = 0.

Our final assumption is that something worth while is produced in 
the economy, i.e., that the value of all goods produced is a positive 
amount. 

Axiom 5. xBy > 0.

If for a given economy (given A and B) we find vectors x and y and
numbers α and β which satisfy these five axioms, we say that we have
found a possible equilibrium solution for the economy.

Example (continued). We have already seen that if x = (1,1), the
economy expands at the fixed rate α = 3. We can now check that
Axiom 1 is satisfied. Actually, xB turns out to equal αxA. Similarly,
we have noted a monetary equilibrium if

    y = ( 6 )
        ( 1 )

and each process multiplies the money put into it by a factor of β = 3.
We can check that Axiom 2 holds. Actually By is equal to βAy in this
case. From these two equations we also know that x(B - αA) and
(B - βA)y are identically 0; hence Axioms 3 and 4 hold. Finally,
xBy = 48; the total value of goods produced is positive, so that
Axiom 5 holds. Therefore these values of x, y, α, and β represent an
equilibrium for the economy. It can also be shown that these are the
only possible values of α and β, and that x and y must be proportional
to those shown here (which may be thought of simply as a change in
the units).
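
The check carried out in this example can be written as a single routine. The sketch below is an illustration only: it reads the inequalities of Axioms 1 and 2 as "at least" and "at most" (an assumption consistent with the example, where equality holds), it works with integers so that the tests for zero are exact, and the function name check_axioms is hypothetical.

```python
def check_axioms(A, B, x, y, alpha, beta):
    """Return, for each of the five axioms, whether it holds."""
    m, n = len(A), len(A[0])
    xA = [sum(x[i] * A[i][j] for i in range(m)) for j in range(n)]
    xB = [sum(x[i] * B[i][j] for i in range(m)) for j in range(n)]
    Ay = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]
    By = [sum(B[i][j] * y[j] for j in range(n)) for i in range(m)]
    surplus = [xB[j] - alpha * xA[j] for j in range(n)]   # x(B - alpha A)
    excess = [By[i] - beta * Ay[i] for i in range(m)]     # (B - beta A)y
    return {
        1: all(s >= 0 for s in surplus),
        2: all(e <= 0 for e in excess),
        3: sum(surplus[j] * y[j] for j in range(n)) == 0,
        4: sum(x[i] * excess[i] for i in range(m)) == 0,
        5: sum(xB[j] * y[j] for j in range(n)) > 0,
    }

A = [[1, 0], [1, 4]]
B = [[1, 12], [5, 0]]

equilibrium = check_axioms(A, B, x=(1, 1), y=(6, 1), alpha=3, beta=3)
print(equilibrium)
```

For data that are not whole numbers, exact fractions (for instance Python's fractions.Fraction) would keep the equality tests in Axioms 3 and 4 reliable.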


In our example we found one and only one equilibrium for the
economy, and we found that α = β. This raises several very natural
questions: (1) Is there a possible equilibrium for every economy? 
(2) If yes, then is there only one? (3) Must the expansion factor al¬ 
ways be the same as the interest factor? In the next section we will 
establish the following answers: (1) For every economy satisfying a 
certain restriction (which is certainly satisfied for all real economies) 
there is a possible equilibrium. (2) There may be more than one 
equilibrium, though the number of different possible expansion factors 
is finite. (In the example there is essentially only one possibility for 
x and y ; however this is not true in general.) (3) The interest and 
expansion factors are always equal in equilibrium. 


EXERCISES 

1. In the example, for x = (1,2), verify for three cycles that xA and xB 
give the correct inputs and outputs. 

2. Give an interpretation of xAy and xBy: 

(a) Using the interpretations of xA and xB given above. 

(b) Using the interpretations of Ay and By given above. 

(c) And show that the results in (a) and (b) are the same. 

3. In the example suppose that two chickens lay eggs and three hatch
eggs. Find x, xA, and xB. Substitute these quantities into Axiom 1, and find
the largest possible expansion factor. [Ans. α = 2.]

4. In the example, suppose that chickens cost 80 cents and eggs cost five
cents. Find y, Ay, and By. Substitute these quantities into Axiom 2, and
find the smallest possible interest factor. [Ans. β = 4.]

5. Show that the x, y, α, and β found in the two previous exercises do not
lead to equilibrium, by showing that Axioms 3 and 4 fail to hold.

6. Show that if α = β = 3, then the only possible x's and y's are proportional
to those given above. (Hint: show that the axioms force us to
choose x_1 = x_2 and y_1 = 6y_2.)


The remaining problems refer to the following economy: On a chicken
farm there is a breed of chicken that lays an average of 16 eggs a month, and
such that they can hatch an average of 3 1/5 eggs.

7. Set up the matrices A and B. 

8. Suppose that three chickens lay and five chickens hatch. Find x, xA,
and xB. What is α? [Ans. x = (3,5); xA = (8,16); xB = (24,48); α = 3.]

9. Suppose that chickens cost 40 cents and eggs five cents. Find y, Ay,
and By. What is β?

10. Verify that the x, y, α, and β found in the previous exercises represent
an equilibrium for the economy, by substituting these into the five axioms.

11. Suppose that we start with 16 chickens and 32 eggs. Choose the intensities
so that the economy will be in equilibrium, and find what happens in
the first three months. [Ans. x = (6,10); 432 chickens, and 864 eggs.]

12. Suppose that with 16 chickens and 32 eggs (see Exercise 11) we start
out by having only five hatching, the others laying. Show that we cannot
have as many chickens after three months as we would have in the equilibrium
solution.

10. EXISTENCE OF AN ECONOMIC EQUILIBRIUM 

We must ask whether the axioms can always be satisfied, i.e., 
whether the model of the economy allows such an equilibrium. 

Of course we are interested only in an economy that could really
occur. That means that these goods must be goods that are somehow 
produced, and that they cannot be produced out of nothing. Hence 
every process must require at least one raw material and every good 
has at least one process that produces it. We summarize this: 

Restriction . Every row of A and every column of B has at least one 
positive component. 
Theorem. If A and B satisfy the restriction, then an equilibrium
is possible.

We will sketch the proof of this theorem. From Axiom 3 we
have that xBy = αxAy, while from Axiom 4, xBy = βxAy. Hence
αxAy = βxAy. Furthermore, from Axiom 5 we know that xBy is not
zero, hence xAy is not zero. Then α = β. Hence in equilibrium the
rate of expansion equals the interest rate.

If α = β, then Axioms 3 and 4 are equivalent. We can also rewrite
the first two axioms (using our result):

Axiom 1'. x(B - αA) ≥ 0.

Axiom 2'. (B - αA)y ≤ 0.

If we multiply the first inequality by y on the right, and the second
by x on the left, we see that Axiom 3 (and hence 4) follows from these
two axioms. Hence we need only worry about Axioms 1', 2', and 5.

The key to the proof is to reinterpret the problem as a game- 
theoretic one. This is done in spite of the fact that no game is in¬ 
volved in the model. We simply use the mathematical results of the 
theory of games as tools. 

Axioms 1' and 2' suggest that we think of the matrix B - αA as a
matrix game. We would then like to think of the vectors x and y
as mixed strategies for the two players. The vectors are nonnegative,
but the sum of their components need not be 1. However, we know
that multiplying x by a constant can be thought of as a change in
the units of intensities, and multiplying y by a constant is equivalent
to a change in the units of the various goods. Hence, without loss
of generality, we may assume that x and y have component sum 1,
and think of them as mixed strategies. If we do this, the two axioms
state precisely that the game has value zero, and that x and y form
a pair of optimal strategies for the two players. Thus our first problem
is to choose α so that the “game” B - αA has value zero.

Example 1. Let us set up the example of the last section as a game.

    M = B - αA = ( 1 - α     12  )
                 ( 5 - α    -4α  )

If we choose x = (1/2,1/2) as a mixed strategy for the row player, then
xM = [3 - α, 2(3 - α)]. If α < 3, the components are both positive;
hence the game has value greater than zero. If we choose

    y = ( 6/7 )
        ( 1/7 )

as a mixed strategy for the column player, then

    My = (1/7) (  6(3 - α) )
               ( 10(3 - α) )

If α > 3, both components are negative, and hence the game has
negative value. We thus see that the only value of α that could possibly
give us a zero value of the game is α = 3, and we see from the
above that in this case the value really is zero, and x and y are optimal
strategies. (See Exercise 1.)
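
The sign of the value on either side of 3 can be confirmed numerically. The sketch below assumes the chicken-farm matrices, so that M has rows (1 - α, 12) and (5 - α, -4α), and uses the standard formula for a 2 × 2 game without a saddle point; the function names are our own, and α should be an integer or a Fraction so that the arithmetic stays exact.

```python
from fractions import Fraction

def value_2x2(M):
    """Value of a 2 x 2 matrix game for the row player."""
    (a, b), (c, d) = M
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:                 # saddle point: strictly determined game
        return Fraction(maximin)
    # Otherwise both players mix, and the value is (ad - bc)/(a + d - b - c).
    return Fraction(a * d - b * c, a + d - b - c)

def game(alpha):
    """M = B - alpha*A for A = [[1, 0], [1, 4]] and B = [[1, 12], [5, 0]]."""
    return ((1 - alpha, 12), (5 - alpha, -4 * alpha))

for alpha in (2, 3, 4):
    print(alpha, value_2x2(game(alpha)))
```

The value is 7/6 at α = 2, zero at α = 3, and -9/8 at α = 4.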

We must now show that the above example is typical in that we can
always find an α making the value of B - αA equal to zero. We may
write this matrix as the sum B + α(-A), and think of our game as a
combination of game B and game (-A).

By our restriction, every column of B has a positive entry. The
strategy vector y for the column player must have at least one positive
component. Hence in the product By, one of the components at least
must be positive. Hence the value of the game B is positive. Since
every row of A has a positive entry, every row of the game -A must
have a negative entry. Hence at least one component of x(-A) must
be negative, and hence (-A) has a negative value.

In the combination B + α(-A) the second term is negligible for
very small α; hence for these the game has positive value. As α increases,
we keep adding larger negative quantities to some of the
entries of the game, i.e., we keep decreasing some of these entries.
Hence the value of the game decreases steadily. For very large α the
first term is negligible, and hence the combined game has negative
value. For some intermediate value of α the game must have value
zero.
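
Because the value does not increase as α grows, an α with value zero can be located by repeated halving of an interval. The following sketch does this for the chicken-farm game only; the restriction to a 2 × 2 game, the starting interval from 0 to 10, and the function names are our own assumptions, and a larger economy would need a general method for solving matrix games.

```python
def value_2x2(M):
    """Value of a 2 x 2 matrix game (saddle point if any, else the mixed-strategy formula)."""
    (a, b), (c, d) = M
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        return maximin
    return (a * d - b * c) / (a + d - b - c)

def game_value(alpha):
    """Value of B - alpha*A for A = [[1, 0], [1, 4]] and B = [[1, 12], [5, 0]]."""
    return value_2x2(((1 - alpha, 12), (5 - alpha, -4 * alpha)))

def find_zero(low, high, steps=60):
    """Bisection; assumes game_value(low) > 0 > game_value(high)."""
    for _ in range(steps):
        mid = (low + high) / 2
        if game_value(mid) > 0:
            low = mid        # value still positive: the zero lies to the right
        else:
            high = mid       # value zero or negative: the zero lies to the left
    return (low + high) / 2

alpha = find_zero(0.0, 10.0)
print(round(alpha, 6))
```

The search settles on α = 3, in agreement with Example 1.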

Example 1 (continued). The value of the combined game M is
plotted for various α in Figure 14. Since B has value 15/4 and -A has
value -1 (see Exercise 2), at the beginning the game M has value
nearly 15/4, and near the end it has value nearly 2 - α, which is well
below zero (see Exercise 3).

We know that there is at least one α for which the game B - αA
has value zero. By choosing such an α together with a pair x, y of
optimal strategies, we arrive at a set of quantities satisfying Axioms
1' and 2'. This still leaves the question of Axiom 5.

If there are two values of α, say p < q, for which the game has
value zero, every value between p and q also has this property. This
is because the value of the game cannot increase as α increases, as we
saw above. Hence we must have a situation such as that shown in
Figure 15. It can be shown, however, that most of these values represent
methods of procedure where nothing worth while is produced,
i.e., where Axiom 5 fails. For Axiom 5 to hold, different values of α
can be achieved only by using at least one new process. Since there
are only a finite number of processes, we can have only a finite number
of different possible α's on the interval between p and q. If p is the
smallest possible expansion rate and q the largest, then p and q are
such that Axiom 5 can be satisfied, and there may be a limited number
of additional ones in between.

Example 2. In the chemical industry we are interested in
manufacturing compounds P, Q, and R. We assume that the basic
chemicals are available in plentiful supply, and that their cost can be
neglected for this analysis. But to manufacture compound P we must
have a unit of both P and Q available, while to manufacture Q we
must have P and R available. Compound R is a by-product of both
manufacturing processes. The exact quantities are given by

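\[
A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 6 & 0 & 1 \\ 0 & 3 & 2 \end{pmatrix},
\]

where the rows correspond to the manufacture of P and of Q, and the
columns to the compounds P, Q, and R. Then

\[
M = B - aA = \begin{pmatrix} 6 - a & -a & 1 \\ -a & 3 & 2 - a \end{pmatrix}.
\]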

Let us choose

x = (1/2, 1/2) and y = [...].

Then

xM = [3 - a, (1/2)(3 - a), (1/2)(3 - a)] and My = [...].

From this we see that if a < 3, then the row player has a guaranteed
profit, while if a > 3, the column player does. Thus a = 3 is the
only possibility, and for this case the value of the game is zero, and
the vectors x and y are optimal strategies, as can be seen from the
fact that xM and My have all components zero. Thus there is a
unique equilibrium, with a = β = 3.
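The claim about xM can be checked with exact arithmetic. This is a
sketch under stated assumptions: the input matrix A follows the verbal
description of the two processes, and the output matrix B is taken to
be (6 0 1; 0 3 2), the reading that agrees with the product xM given
above. The helper names game_matrix and row_times_matrix are
introduced here for illustration.

```python
from fractions import Fraction as F

# Rows are the processes (manufacture of P, manufacture of Q);
# columns are the compounds P, Q, R.
A = [[1, 1, 0], [1, 0, 1]]   # inputs (assumed reading of the example)
B = [[6, 0, 1], [0, 3, 2]]   # outputs (assumed reading of the example)


def game_matrix(a):
    """Return M = B - aA for expansion rate a."""
    return [[B[i][j] - a * A[i][j] for j in range(3)] for i in range(2)]


def row_times_matrix(x, M):
    """Return the row vector xM."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


x = [F(1, 2), F(1, 2)]
for a in (F(2), F(3), F(4)):
    print(a, row_times_matrix(x, game_matrix(a)))
```

Every component of xM is a positive multiple of 3 - a, so the row
player is guaranteed a profit for a < 3 and exactly zero at a = 3.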

We also find that the mixed strategy x is unique, which means that
the two processes must be used with the same intensity. However,
the strategy y is not unique. We may instead use

y' = [...] or y'' = [...],

or any mixture ty' + (1 - t)y'', 0 < t < 1. Our y is the case t = [...].
Hence we see that different price structures are possible, each leading
to the same expansion rate.
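The non-uniqueness of y can be illustrated directly. The sketch below
assumes the matrices A = (1 1 0; 1 0 1) and B = (6 0 1; 0 3 2) for
Example 2. The two price vectors y_a and y_b are names introduced
here; they are the two extreme probability vectors that solve My = 0
at a = 3, and they need not coincide with the y' and y'' of the text.

```python
from fractions import Fraction as F

A = [[1, 1, 0], [1, 0, 1]]   # inputs (assumed reading of Example 2)
B = [[6, 0, 1], [0, 3, 2]]   # outputs (assumed reading of Example 2)


def game_matrix(a):
    """Return M = B - aA for expansion rate a."""
    return [[B[i][j] - a * A[i][j] for j in range(3)] for i in range(2)]


def matrix_times_column(M, y):
    """Return the column vector My as a list."""
    return [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]


# Hypothetical labels for the two extreme price vectors found by hand.
y_a = [F(1, 2), F(1, 2), F(0)]
y_b = [F(0), F(1, 4), F(3, 4)]


def mix(t):
    """Return the mixture t*y_a + (1 - t)*y_b."""
    return [t * p + (1 - t) * q for p, q in zip(y_a, y_b)]


for t in (F(0), F(1, 3), F(1, 2), F(1)):
    y = mix(t)
    print(t, y, matrix_times_column(game_matrix(F(3)), y))
```

For every t between 0 and 1 the product My is zero at a = 3, so each
mixture is an optimal price vector for the same expansion rate.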

Example 3. This "economy" is a schematic representation of the
production of essentials and inessentials in a society. Goods are
lumped together into two types, E (essential goods) and I (inessential
goods or luxury items). For the manufacture of E we need only
essential goods (since anything so needed is essential). For the
manufacture of I we may need both types of raw materials. Let us suppose
that our economy functions as follows.

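\[
A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
B = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix},
\]

where the rows correspond to the manufacture of essentials and of
luxuries, and the columns to the goods E and I. Then

\[
M = B - aA = \begin{pmatrix} 4 - a & 0 \\ -a & 2 - a \end{pmatrix}.
\]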
With a little patience we can determine the values of M for various
values of a, and we arrive at the curve in Figure 15. (See Exercise 4.)
Hence a must be between 2 and 4. For a = 4, we have the optimal
strategies x = (1, 0) and y = [...], which satisfy all our axioms; while
for a = 2 we have x = (1/2, 1/2) and y = [...].

For in-between values of a we cannot satisfy Axiom 5. (See Exercises
5-7.) Hence there are two possible equilibria: (1) The society can
decide to manufacture only essentials, in which case the production
of these will increase rapidly. (2) By putting a high enough value on
inessentials, it will arrive at an equilibrium in which both essentials
and inessentials are produced, but then the rate of expansion is
considerably decreased.
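The curve of values for Example 3 can be tabulated with a few lines of
code. This is a sketch, not the procedure of the text: solve_2x2 and
example3_matrix are names introduced here, and the solver uses the
usual recipe for 2 × 2 games (look for a saddle point first, and
otherwise apply the formula (ad - bc)/(a + d - b - c) for a game that
is not strictly determined).

```python
from fractions import Fraction as F


def solve_2x2(M):
    """Value of a 2 x 2 matrix game in which the row player maximizes."""
    (a, b), (c, d) = M
    maximin = max(min(a, b), min(c, d))   # best guaranteed payoff for the row player
    minimax = min(max(a, c), max(b, d))   # best guaranteed ceiling for the column player
    if maximin == minimax:                # saddle point: strictly determined game
        return maximin
    return F(a * d - b * c) / (a + d - b - c)


def example3_matrix(a):
    """M = B - aA for Example 3, with A = (1 0; 1 1) and B = (4 0; 0 2)."""
    return [[4 - a, 0], [-a, 2 - a]]


values = [solve_2x2(example3_matrix(a)) for a in range(7)]
print([round(float(v), 2) for v in values])
```

The values are positive below a = 2, zero from a = 2 through a = 4,
and negative above a = 4, in agreement with the answers listed for
Exercise 4.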


We have now provided complete answers for the three questions
raised at the end of the last section, providing a mathematical
solution to a series of economic problems.


EXERCISES 

1. In Example 1 verify that for a = 3 the game M has value 0, and that
the x and y given are optimal strategies.

2. In Example 1 solve the 2 × 2 games B and -A, finding their values
and pairs of optimal strategies.

3. In Example 1: 

(a) Show that the game M is nonstrictly determined for every a. 

(b) Find the value of M for any a. [Ans. (5 + a)(3 - a)/(4 + a).]

(c) Show that the value for a = .01 is very near 15/4.

(d) Show that the value for a = 100 is very near -98.

(e) Show that the value is 0 if and only if a = 3.

4. Find the value of M in Example 3 for a = 0, 1, 2, 3, 4, 5, and 6.
[Hint: Some of these games are strictly determined.]

[Ans. 1.33; .60; 0; 0; 0; -1.00; -2.00.]

5. In Example 3, for a = 4, verify that the strategies given are optimal,
and that Axiom 5 is satisfied.

6. In Example 3, for a = 2, verify that the strategies given are optimal, 
and that Axiom 5 is satisfied. 

7. In Example 3, for a = 3, find the unique optimal x and y, and show
that Axiom 5 is not satisfied. Prove that the same happens for every a if
2 < a < 4.


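The remaining problems refer to the following economy: There are four
goods and five processes, and the economy is given by

\[
A = \begin{pmatrix} 0 & 0 & 1 & 1 \\ 0 & 0 & 2 & 2 \\ 0 & 4 & 0 & 2 \\ 2 & 1 & 1 & 0 \\ 3 & 0 & 6 & \ldots \end{pmatrix}, \qquad
B = \begin{pmatrix} 0 & 0 & 4 & 2 \\ 0 & 0 & 5 & 7 \\ 6 & 5 & 4 & 0 \\ 0 & 4 & 0 & 3 \\ 1 & 0 & \ldots & \ldots \end{pmatrix}.
\]

Also let

x = (1/2, 1/2, 0, 0, 0);  x' = (0, 0, 2/5, 3/5, 0);  y = [...];  y' = [...].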

8. Verify that A and B satisfy the restriction. 


9. Compute M = B - aA.


10. Compute xM, x'M, My, and My'.

11. When will x'M have all positive entries? When will My' have all
negative entries? What possibilities does this leave for a?

[Ans. a < 2; a > 3; 2 ≤ a ≤ 3.]

12. Show that for the remaining possible values of a the game M has 
value zero, and x and y are optimal strategies. 

13. Show that for the largest possible a the vectors x and y' provide optimal 
strategies which satisfy Axiom 5. 

14. Show that for the smallest possible a the vectors x' and y provide
optimal strategies which satisfy Axiom 5.

15. If a is in between its two extreme values, show that: 

(a) xM is positive in its last two components, and hence the second 
player can use only his first two strategies. 

(b) My is negative in its last three components, and hence the first 
player can use only his first two strategies. 

(c) For these cases it is impossible to satisfy Axiom 5.

16. Process number five is in a special position. Why? [Ans. Never used.]

17. Use the results of Exercises 8-16 to show that there are exactly two
possible equilibriums for this economy. Interpret each equilibrium, and point
out the differences between the two methods of operating the economy.

[Ans. At the price of reducing the expansion rate, the economy can 
produce a larger variety of goods. To achieve this, the additional 
types of goods must be valued (relatively) very high.]